Microbiome Helper 2 Setting up environments for analysis - LangilleLab/microbiome_helper GitHub Wiki
Authors: Robyn Wright Modifications by: NA
Note that this is still a work in progress! We don't guarantee that everything will work :)
Launched instance from previous CBW-ICG-2024 image.
Update conda:
conda update conda
Install latest QIIME2:
conda env create \
--name qiime2-amplicon-2025.4 \
--file https://raw.githubusercontent.com/qiime2/distributions/refs/heads/dev/2025.4/amplicon/released/qiime2-amplicon-ubuntu-latest-conda.yml
Got this warning:
For Linux 64, Open MPI is built with CUDA awareness but this support is disabled by default.
To enable it, please set the environment variable OMPI_MCA_opal_cuda_support=true before
launching your MPI processes. Equivalently, you can set the MCA parameter in the command line:
mpiexec --mca opal_cuda_support 1 ...
In addition, the UCX support is also built but disabled by default.
To enable it, first install UCX (conda install -c conda-forge ucx). Then, set the environment
variables OMPI_MCA_pml="ucx" OMPI_MCA_osc="ucx" before launching your MPI processes.
Equivalently, you can set the MCA parameters in the command line:
mpiexec --mca pml ucx --mca osc ucx ...
Note that you might also need to set UCX_MEMTYPE_CACHE=n for CUDA awareness via UCX.
Please consult UCX's documentation for detail.
Install fastqc and multiqc in QIIME2 environment:
conda activate qiime2-amplicon-2025.4
mamba install bioconda::fastqc
mamba install bioconda::multiqc
Update R:
sudo apt update
sudo apt install r-base
Remove previous environments:
conda env remove -n qiime2-amplicon-2024.2-backup
conda env remove -n picrust2
conda env remove -n biobakery3
conda env remove -n anvio-7
conda env remove -n rgi
conda env remove -n checkm
conda env remove -n functional
conda env remove -n taxonomic
Install PICRUSt2 from conda:
conda create -n picrust2-v2.6.2
mamba activate picrust2-v2.6.2
mamba install bioconda::picrust2
Install kneaddata:
mamba create -n kneaddata-v0.12.2
mamba activate kneaddata-v0.12.2
mamba install bioconda::kneaddata
mamba install bowtie2 #unnecessary as already installed
mamba install bioconda::trimmomatic
mamba install bioconda::trf
mamba install bioconda::fastqc
mamba install bioconda::multiqc
mamba install conda-forge::parallel
Install Kraken2:
mamba create -n kraken2-v2.14
mamba activate kraken2-v2.14
mamba install bioconda::bracken
mamba install bioconda::kraken2
mamba install conda-forge::parallel
kraken2 --version always reports 2.1.3, but mamba reports this when installing:
  Package   Version   Build              Channel    Size
  ──────────────────────────────────────────────────────
  Reinstall:
  ○ kraken2   2.14    pl5321h077b44d_0   bioconda   Cached
Install Anvi'o:
conda create -y --name anvio-8 python=3.10
conda activate anvio-8
mamba install -y -c conda-forge -c bioconda python=3.10 \
sqlite=3.46 prodigal idba mcl muscle=3.8.1551 famsa hmmer diamond \
blast megahit spades bowtie2 bwa graphviz "samtools>=1.9" \
trimal iqtree trnascan-se fasttree vmatch r-base r-tidyverse \
r-optparse r-stringi r-magrittr bioconductor-qvalue meme ghostscript \
nodejs=20.12.2
mamba install -y -c bioconda fastani
curl -L https://github.com/merenlab/anvio/releases/download/v8/anvio-8.tar.gz \
--output anvio-8.tar.gz
pip install anvio-8.tar.gz
mamba install bioconda::concoct
mamba install bioconda::metabat2
mamba install bioconda::maxbin2
mamba install bioconda::das_tool
mamba install bioconda::binsanity
mamba install bioconda::gtdbtk
mamba install usearch
#note that at some point I needed to downgrade scikit-learn to v 1.1.0 when I got pickle errors
#pip install scikit-learn==1.1.0
Install CheckM2:
mamba create -n checkm2 -c bioconda -c conda-forge checkm2
Install RGI (for CARD):
mamba create -n rgi-v6.0.4
mamba activate rgi-v6.0.4
mamba install bioconda::rgi
mamba install conda-forge::parallel
RStudio server:
sudo apt update
sudo apt upgrade
sudo apt-get install r-base
sudo apt-get install gdebi-core
wget https://download2.rstudio.org/server/jammy/amd64/rstudio-server-2025.05.1-513-amd64.deb
sudo gdebi rstudio-server-2025.05.1-513-amd64.deb
Programs for getting a tree of metagenomic reads:
mamba create -n get_tree python=3.9
mamba activate get_tree
mamba install conda-forge::ete3
mamba install conda-forge::pandas
Programs for functional annotation of metagenomic reads with HMMs:
mamba create -n annotate_hmm
mamba activate annotate_hmm
mamba install anaconda::biopython
mamba install bioconda::clustalo
mamba install bioconda::hmmer
mamba install bioconda::raxml
mamba install bioconda::epa-ng
mamba install bioconda::gappa
mamba install conda-forge::r-castor
mamba install conda-forge::ete3
mamba install conda-forge::pandas
These commands may vary depending on your operating system, but these are the details of the setup that has worked for us.
These are installed using an install of conda/Anaconda that is accessible to everyone in /opt/anaconda3/, and all users are added to a group anaconda.
conda create -n quality_control_feb2026
conda activate quality_control_feb2026
conda install bioconda::fastqc #version 0.12.1
conda install bioconda::multiqc #version 1.33
conda install bioconda::kneaddata #version 0.12.4
conda install conda-forge::parallel #version 20260122
Note that I have found it necessary/helpful to first set:
conda config --set channel_priority flexible
conda env create \
--name qiime2-amplicon-2026.1 \
--file https://raw.githubusercontent.com/qiime2/distributions/refs/heads/dev/2026.1/amplicon/released/qiime2-amplicon-ubuntu-latest-conda.yml
Install MMseqs2:
conda create -n mmseqs2-18.8cc5c
conda activate mmseqs2-18.8cc5c
conda install bioconda::mmseqs2
Create UniRef90 database:
#Done on 2nd Feb 2026
mkdir mmseqs2_db #note that you should first navigate to wherever you'd like to install this!
mmseqs databases UniRef90 mmseqs2_db/UniRef90_2026-01 /tmp
In our functional annotation pipeline, we use information on gene lengths for normalisation, and we also map UniRef90 IDs to EC numbers. To do this, we have generated several files containing this information, intended to be used with our scripts. Details of the steps required to make these files are below, but note that some of them are large and did take a while to generate.
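As a rough illustration of why gene lengths matter for normalisation (the exact formula used in our scripts may differ), here is a minimal sketch assuming reads-per-kilobase-style scaling, with made-up gene names and counts:

```python
# Hypothetical read counts per UniRef90 gene family (made up for illustration)
counts = {'UniRef90_A': 300, 'UniRef90_B': 300}
# Gene lengths in nucleotides (as stored in the GeneLength file described below)
lengths = {'UniRef90_A': 1000, 'UniRef90_B': 3000}

# Reads-per-kilobase style normalisation: at equal true abundance, a longer
# gene accumulates more reads, so divide each count by the length in kb
rpk = {gene: counts[gene] / (lengths[gene] / 1000) for gene in counts}
print(rpk)  # {'UniRef90_A': 300.0, 'UniRef90_B': 100.0}
```

After normalisation the two genes no longer look equally abundant: the count for the 3 kb gene is scaled down threefold relative to the 1 kb gene.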
Get the UniRef90 fasta file so we can get gene length information (this file is ~40GB):
wget https://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz
Get the file to take UniRef90 to EC mapping information from (and unzip it - note this file is ~500GB uncompressed):
wget https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_trembl.dat.gz
gunzip uniprot_trembl.dat.gz
This followed the conversation from here.
Get the EC number descriptions:
wget https://ftp.expasy.org/databases/enzyme/enzyme.dat
Make a dictionary of EC numbers to their descriptions (Python code):
import pickle
descriptions = {}
for row in open('enzyme.dat', 'r'):
    row = row.replace('\n', '').split('   ') #fields are separated by three spaces
    if row[0] == 'ID':
        this_id = row[1]
    if row[0] == 'DE':
        descriptions[this_id] = row[1]
with open('EC_descriptions.dict', 'wb') as f:
    pickle.dump(descriptions, f)
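The parsing logic above can be sanity-checked on a miniature snippet in the enzyme.dat flat-file format (a two-letter line code, three spaces, then the content); the entries below are real-looking but hard-coded for illustration:

```python
# Miniature enzyme.dat-style snippet (hard-coded for illustration):
# two-letter line code, three spaces, then the line content
sample = """ID   1.1.1.1
DE   Alcohol dehydrogenase.
ID   1.1.1.2
DE   Alcohol dehydrogenase (NADP(+)).
"""

descriptions = {}
for row in sample.splitlines():
    row = row.split('   ')
    if row[0] == 'ID':
        this_id = row[1]       # current EC number
    if row[0] == 'DE':
        descriptions[this_id] = row[1]  # description for the current EC number

print(descriptions['1.1.1.1'])  # Alcohol dehydrogenase.
```

Note that entries with multi-line DE descriptions keep only the last DE line with this approach.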
Get gene lengths from the fasta file - note that these sequences are amino acids, so the lengths are multiplied by 3 to give nucleotide lengths for our purposes of working with metagenomic data:
from Bio import SeqIO
import pickle
import bz2
import gzip
lengths = {}
with gzip.open('uniref90.fasta.gz', 'rt') as f:
    for record in SeqIO.parse(f, "fasta"):
        lengths[record.name] = len(str(record.seq))*3
with bz2.BZ2File('GeneLength.pbz2', 'wb') as f:
    pickle.dump(lengths, f)
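Since uniref90.fasta.gz is ~40GB, the amino-acid-to-nucleotide conversion can be illustrated on a tiny in-memory FASTA record without Biopython (the record name and sequence below are made up; a protein of N residues is encoded by 3N nucleotides, not counting the stop codon):

```python
# Made-up two-line FASTA record, 20 residues total
fasta = """>UniRef90_EXAMPLE made-up record for illustration
MKTAYIAKQR
QISFVKSHFS
"""

lengths = {}
name = None
for line in fasta.splitlines():
    if line.startswith('>'):
        # record name = first whitespace-separated token after '>'
        name = line[1:].split()[0]
        lengths[name] = 0
    else:
        # amino-acid residues -> nucleotides (x3)
        lengths[name] += len(line.strip()) * 3

print(lengths['UniRef90_EXAMPLE'])  # 20 residues * 3 = 60
```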
Get the mapping from UniRef90 to EC numbers:
import pickle
import bz2
mapping = {}
count = 0
for row in open('uniprot_trembl.dat', 'r'):
    if count % 10000 == 0: print(count, len(mapping))
    count += 1
    this_row = row.replace('\n', '').split('   ')
    if this_row[0] == 'ID':
        this_id = this_row[1]
    elif 'EC=' in row:
        mapping[this_id] = row.split('EC=')[1].split(' ')[0].replace(';\n', '')
#convert UniProt entry names (ACCESSION_SPECIES) to UniRef90-style IDs
new_mapping = {}
for up in mapping:
    new_mapping['UniRef90_'+up.split('_')[0]] = mapping[up]
with bz2.BZ2File('ECmapped_2026-01.pbz2', 'wb') as f:
    pickle.dump(new_mapping, f, protocol=0)
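The two parsing steps can be sketched on made-up uniprot_trembl.dat-style lines (the entry name and EC number below are illustrative, not real entries):

```python
# Made-up flat-file lines: an ID line (three-space-separated fields) and a
# DE line carrying an EC number
id_line = 'ID   A0A023GPI8_CANBL   Unreviewed;   256 AA.'
de_line = 'DE   EC=3.1.1.3;\n'

# Entry name is the second three-space-separated field of the ID line
this_id = id_line.split('   ')[1]
# EC number is whatever follows 'EC=' up to the terminating ';'
ec = de_line.split('EC=')[1].split(' ')[0].replace(';\n', '')

# TrEMBL entry names have the form ACCESSION_SPECIES; the accession part is
# what appears in UniRef90 cluster IDs
mapping = {'UniRef90_' + this_id.split('_')[0]: ec}
print(mapping)  # {'UniRef90_A0A023GPI8': '3.1.1.3'}
```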
Add the previous mapping and the HUMAnN mapping (v3.6 utility mapping file) to the mapping file that we just made:
import pickle
import bz2
humann = 'map_level4ec_uniref90.txt'
humann_map = {}
for row in open(humann, 'r'):
    row = row.replace('\n', '').split('\t')
    ec = row[0]
    for uniref_id in row[1:]:
        humann_map[uniref_id] = ec
with bz2.BZ2File('ECmapped_2026-01.pbz2', 'rb') as f:
    new_mapping = pickle.load(f)
with bz2.BZ2File('/bigpool/shared/mmseqs2_db/UniRef90_Dhwani/ECmapped.pbz2', 'rb') as f:
    old_mapping = pickle.load(f)
#keeping priority by age of the databases (later operands of | win): the new_mapping made from the 2026-01 download takes priority, then the HUMAnN v3.6 mapping, then the older one that Dhwani put together
combined_mapping = old_mapping | humann_map | new_mapping
with bz2.BZ2File('ECmapped.pbz2', 'wb') as f:
    pickle.dump(combined_mapping, f, protocol=0)
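The priority order of the dict union above can be checked on toy dicts with deliberately conflicting keys (keys and EC numbers below are made up); later operands of `|` win, which is why the newest mapping goes last:

```python
# Toy mappings with overlapping keys (made up for illustration)
old_mapping = {'UniRef90_A': '1.1.1.1', 'UniRef90_B': '2.2.2.2'}
humann_map = {'UniRef90_B': '3.3.3.3', 'UniRef90_C': '4.4.4.4'}
new_mapping = {'UniRef90_C': '5.5.5.5'}

# dict union (Python 3.9+): for duplicate keys, the rightmost operand wins
combined = old_mapping | humann_map | new_mapping
print(combined)
# UniRef90_B comes from humann_map, UniRef90_C from new_mapping
```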
Install Kraken2:
mamba create -n kraken2-v2.17.1
mamba activate kraken2-v2.17.1
conda install bioconda::bracken #version
conda install bioconda::kraken2 #version 2.17.1
conda install conda-forge::parallel #version 20260122
Install GeCoCheck (genome_coverage_checker):
cd /home/robyn/tools/genome_coverage_checker_all/v0.0.4
wget https://github.com/R-Wright-1/genome_coverage_checker/archive/refs/tags/v0.0.4.tar.gz
tar -xvf v0.0.4.tar.gz
cd genome_coverage_checker-0.0.4
conda env create --name GeCoCheck-v0.0.4 -f coveragechecker-env.yaml
conda activate GeCoCheck-v0.0.4
pip install --editable .
Install PICRUSt2 from source:
wget https://github.com/picrust/picrust2/archive/refs/tags/v2.6.3.zip
unzip v2.6.3.zip
cd picrust2-2.6.3/
conda env create -f picrust2-env.yaml
conda activate picrust2
pip install --editable .