1B Install Mamba and Environments - NU-CPGME/sl_workshop_2024 GitHub Wiki

July / August, 2024

Developed by:
Egon A. Ozer, MD PhD ([email protected])
Ramon Lorenzo Redondo, PhD ([email protected])


Mamba

Mamba (a C++ implementation of the Python-based Conda software) is a software package manager that allows you to easily install much of the software required for this workshop on your computer. Software is installed into "environments" that can be activated and deactivated from the command line. By setting up Mamba/Conda environments you can have multiple versions of the same software on one computer and avoid conflicts between different versions of software packages or incompatible software. This also allows you to install a piece of software, together with any prerequisite programs it needs to run, in one step. Another major advantage of Mamba/Conda (and a reason why we'll be using it for this workshop) is that you can be sure that everyone is using the same version of each software application regardless of when they download it and what computer they are using (good for reproducibility).

For a nice introduction to the basic functionality and commands of Mamba/Conda (essentially the same for both), see this tutorial or here for more detail.

Step 1 - Download and install micromamba:

For this workshop we'll be using micromamba, as it is a very stripped-down, easy-to-install, and fast version of Mamba.

Installing micromamba should be a snap. The commands below install curl, run the micromamba installer (automatically picking all of the default options), and reload your shell configuration:

sudo apt install curl
"${SHELL}" < <(curl -L micro.mamba.pm/install.sh)
source ~/.bashrc
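
Once the installer finishes, it can be worth confirming that your shell can actually see micromamba. The following is a minimal sketch rather than part of the original instructions; the variable name `mm_status` is purely illustrative.

```shell
# Check whether micromamba is available in the current shell session.
# mm_status is an illustrative variable name, not something micromamba defines.
if command -v micromamba >/dev/null 2>&1; then
  mm_status="found"
  micromamba --version   # print the installed version
else
  mm_status="missing"
  echo "micromamba not found: open a new terminal or run 'source ~/.bashrc' and try again"
fi
echo "micromamba status: ${mm_status}"
```

If the status is "missing" right after installation, the most likely cause is that the current terminal has not yet reloaded `~/.bashrc`.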

Step 2 - Set up bioconda

Enter the following commands, either one at a time or by copying and pasting all of them into your terminal. The order of the commands is important, though. For more information, check out Bioconda.

micromamba config append channels defaults
micromamba config append channels bioconda
micromamba config append channels conda-forge
micromamba config set channel_priority strict

Note: the commands for adding channels and setting channel_priority differ slightly between micromamba and conda. The commands above are specific to micromamba and will probably give you an error in conda.
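
To see what the channel commands actually recorded, you can ask micromamba to print its current configuration. This is a small sketch, assuming micromamba is already installed; `channel_config` is just an illustrative variable name.

```shell
# Show the configuration micromamba will use, including the channel list
# and the channel_priority setting.
if command -v micromamba >/dev/null 2>&1; then
  channel_config="$(micromamba config list)"
else
  channel_config="(micromamba is not available in this shell)"
fi
printf '%s\n' "${channel_config}"
```

Channels nearer the top of the printed list generally take precedence over those below them, so this is a quick way to check that the order is what you intended.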

Step 3 - How to install packages in mamba environments:

Mamba environments can be set up by 1) manually creating a new environment and then adding software packages to the environment one at a time, 2) listing the software you want added to the environment when you create it, or 3) using a specially formatted environment.yml file to create the environment and install all the correct packages in one step. If you want more detail about setting up Conda and Mamba environments, take a look here.

Example Option 1: We could start by creating an environment we will name "ws_test1" using the -n option, activating the environment, and then installing a single software package called "circlator" from the bioconda channel (using -c) with its dependencies in that environment.

micromamba create -n ws_test1
micromamba activate ws_test1
micromamba install -c bioconda circlator

Circlator is a program for circularising genome assemblies. We're using it here only for demonstration purposes.

When you are done using the environment, you can exit it using the micromamba deactivate command to return to your base environment.
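
If you want to confirm that a package really landed in an environment, one option is to list the environment's packages and search for the name. The sketch below assumes the "ws_test1" environment from Option 1; `env_has_package` is a hypothetical helper written for this example, not a micromamba command.

```shell
# env_has_package is a hypothetical helper: it lists the packages in an
# environment and succeeds if any line mentions the given package name.
env_has_package() {
  local env_name="$1" pkg="$2"
  micromamba list -n "${env_name}" 2>/dev/null | grep -qi -- "${pkg}"
}

if env_has_package ws_test1 circlator; then
  echo "circlator is installed in ws_test1"
else
  echo "circlator was not found in ws_test1 (or micromamba is unavailable)"
fi
```

Because the check uses `-n` to name the environment, it works whether or not that environment is currently activated.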

Example Option 2: You can also create an environment and install software in the environment in a single step like so:

micromamba create -c bioconda -n ws_test2 circlator

New software can be installed in an existing environment by activating the environment and then using the micromamba install command.

As a warning, the more software packages you install in an environment, the greater the risk that conflicts will arise. Sometimes it can take Mamba a very long time to resolve those conflicts, and sometimes it can't resolve them at all.

Example Option 3: Environment files that contain lists of packages and other information can be used to quickly create environments. These are formatted as .yml configuration files.

Here is an example of a simple environment file we could save as a text file named "environment.yml":

name: ws_test3
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - circlator
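
If you would rather not open a text editor, the file above could also be written straight from the terminal with a heredoc. This is a sketch only: it writes `environment.yml` into whatever directory you are currently in, overwriting any file of that name.

```shell
# Write the example environment file into the current directory.
# Quoting 'EOF' keeps the shell from expanding anything inside the block.
cat > environment.yml <<'EOF'
name: ws_test3
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - circlator
EOF

# Print the file back to confirm the indentation survived.
cat environment.yml
```

YAML is sensitive to indentation, so it is worth checking that the list items under `channels:` and `dependencies:` are indented consistently.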

To create an environment from the environment.yml file we would use the following command:

micromamba env create -f environment.yml

Other useful Mamba commands:

  • To get a list of all of your installed mamba environments: micromamba env list
  • To remove a mamba environment (like the ws_test1 environment we created above): micromamba remove -n ws_test1 --all

Create Workshop Environments

We are going to use software in mamba environments for several exercises in the rest of the workshop. Since they're relatively small and uncomplicated, we'll just create the environments manually here:


A. assembly Environment

micromamba create -y \
  -n assembly \
  -c conda-forge \
  -c bioconda \
  -c defaults \
  fastqc quast fastp spades

In case of a weak internet connection, you can try the following commands in the order provided:

micromamba create -y -n assembly -c conda-forge -c bioconda -c defaults
micromamba activate assembly
micromamba install -y -n assembly -c bioconda fastqc
micromamba install -y -n assembly -c bioconda quast
micromamba install -y -n assembly -c bioconda fastp
micromamba install -y -n assembly -c bioconda spades

B. annotation Environment

micromamba create -y \
  -n annotation \
  -c conda-forge \
  -c bioconda \
  -c defaults \
  prokka=1.14.6

Note on MacOS installation:

WARNING: If you are installing prokka on a Mac, the Conda version has been too buggy to use. Until it's fixed someday, you would have to use the following commands to install prokka on Mac:

Using [HomeBrew](https://docs.brew.sh/):

```Shell
brew install brewsci/bio/prokka
sudo cpan install Bio::SearchIO::hmmer
```

If you don't have HomeBrew:

```Shell
sudo cpan Time::Piece XML::Simple Digest::MD5 Bio::Perl Bio::SearchIO::hmmer
git clone https://github.com/tseemann/prokka.git $HOME/prokka
$HOME/prokka/bin/prokka --setupdb
```

C. alignment Environment

micromamba create -y \
  -n alignment \
  -c conda-forge \
  -c bioconda \
  -c defaults \
  ivar snippy=4.6.0 snpeff=4

Note on MacOS installation:

NOTE: On MacOS, the Conda version of snippy is broken. To install manually if you have [HomeBrew](https://docs.brew.sh/):

```Shell
brew install brewsci/bio/snippy
brew install brewsci/bio/vt
```

If you don't have HomeBrew:

```Shell
git clone https://github.com/tseemann/snippy.git $HOME/snippy
echo "export PATH=$HOME/snippy/bin:\$PATH" >> ~/.bashrc
source ~/.bashrc
snippy --check
```

Then install just ivar in the mamba environment:

```Shell
micromamba create -n alignment -c conda-forge -c bioconda -c defaults ivar
```

D. phylogenetics Environment

micromamba create -y \
  -n phylogenetics \
  -c conda-forge \
  -c bioconda \
  -c defaults \
  treetime iqtree mafft
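
After creating the four environments, a quick loop can confirm that none of them was skipped, for example because of a dropped connection. This is a minimal sketch; the variable names are illustrative, and the match is a simple word search over the output of `micromamba env list`, so an unusual install path containing one of these names could produce a false "OK".

```shell
# Environment names created in sections A-D above.
expected_envs="assembly annotation alignment phylogenetics"

# Capture the environment list once; this is empty if micromamba is unavailable.
env_list="$(micromamba env list 2>/dev/null || true)"

missing_envs=""
for env_name in ${expected_envs}; do
  # -w matches whole words, so "alignment" does not match a name such as "alignment2".
  if printf '%s\n' "${env_list}" | grep -qw -- "${env_name}"; then
    echo "OK:      ${env_name}"
  else
    echo "MISSING: ${env_name}"
    missing_envs="${missing_envs} ${env_name}"
  fi
done
```

Any environment reported as MISSING can be created again by re-running the corresponding `micromamba create` command from its section.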

GitHub

We are using GitHub right now for hosting the workshop documents and data, but GitHub's primary use is for software source code development and version control. Often newer versions of software will be available on GitHub than in Conda/Mamba or in other package managers such as APT. A folder containing source code and supporting documents for a piece of software is called a "repository."

Software or source code can be downloaded over the web from the GitHub website, but it's often more convenient to use the command line to get code from GitHub.

As an example, we're going to download the source code of Filtlong, a program for quality trimming long sequencing reads generated by Nanopore or PacBio platforms.

Below are the commands to "clone" (copy) the GitHub repository for Filtlong onto your computer and compile it.

mkdir -p ~/applications
cd ~/applications
git clone https://github.com/rrwick/Filtlong
cd Filtlong
make
bin/filtlong -h
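
The commands above run Filtlong by its path inside the repository. If you would like to call it simply as `filtlong` from any directory, one option is to add its `bin` folder to your PATH. The sketch below does this for the current terminal session only and assumes the `~/applications/Filtlong` location used above; `filtlong_bin` is an illustrative variable name.

```shell
# Directory where the steps above place the compiled binary.
filtlong_bin="${HOME}/applications/Filtlong/bin"

# Prepend it to PATH for the current session only, avoiding duplicates.
case ":${PATH}:" in
  *":${filtlong_bin}:"*) ;;                    # already on PATH, nothing to do
  *) export PATH="${filtlong_bin}:${PATH}" ;;
esac

# Show which filtlong (if any) the shell now resolves.
command -v filtlong || echo "filtlong not found: check that 'make' completed without errors"
```

To make the change permanent you could append the `export PATH=...` line to `~/.bashrc`, in the same way the micromamba installer modifies that file.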

Now we'll use git to download the data files we'll be using throughout the rest of the workshop.

cd ~
git clone https://github.com/NU-CPGME/sl_workshop_2024
cd ~/sl_workshop_2024
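
If the workshop materials are updated after you have cloned them, you do not need to clone again. A sketch of how the existing copy could be refreshed, assuming it lives at `~/sl_workshop_2024` as above (`repo_dir` is an illustrative variable name):

```shell
# Location of the clone created above.
repo_dir="${HOME}/sl_workshop_2024"

if [ -d "${repo_dir}/.git" ]; then
  # --ff-only refuses to merge if your local copy has diverged,
  # so it will not silently rewrite files you have edited and committed.
  git -C "${repo_dir}" pull --ff-only
else
  echo "No clone found at ${repo_dir}: run the git clone command first"
fi
```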

Using git and GitHub for software development and version control could be the topic of a whole other workshop. Just know they are very powerful tools for software and other development.



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
