015: Singularity, part 2: anatomy of Singularity's recipe - McGranahanLab/Guidebook GitHub Wiki
A video tutorial accompanying this text tutorial is available here.
Singularity is a program which packages (a process known as containerization) almost any software into a single file, making it portable and reproducible.
- Portability means that the file with the software can be copied to any computer, server, workstation, HPC, etc. and stay the same.
- Reproducibility means that, given a Singularity container with a software, the results of running that software on any computer, server, workstation, or HPC will stay exactly the same.
- Reproducibility. It is common knowledge that the results of a software run can depend heavily on the software version (especially, but not exclusively, in bioinformatics). Even if exact software versions are noted in the methods, some users may have difficulty retrieving or installing them. Containers solve this issue effortlessly: a future user can simply download a container, exactly the same one you performed your analysis with, and run it on their data. So, if done right, containers are the road to ultimate reproducibility.
- No installation hassle. With Singularity containers there are no struggles with software installation and version management. Containers are good to go a second after download or creation. Need to test another software version? No problem! Just switch to another container; there is no need to worry about the compatibility of various versions on a computer/server/HPC. Containers also make you less dependent on the cluster support team if you work on an HPC, as you won't need to bother them every time your project takes a new turn and you need yet another software.
A container is like light, which has properties of both a wave and a particle: a container can be viewed as a file, as software, and as a whole remote computer.
- Like a file, a container can be moved from folder to folder, copied, deleted, downloaded, uploaded, encrypted, etc.
- Like software, a container harbors code which can be used to analyse data.
- Like a remote computer, a container has a whole operating system (OS), e.g. Ubuntu, inside it. As with a remote server, it is possible to log into it and interact with the software installed inside.
Hopefully, by now all the advantages of software containerization with Singularity are well outlined, and it is clear that containers are a useful and indispensable part of future fully reproducible and easy-to-manage research. The open question is "how do I create a container and use it?". Let's dive into it!
To understand what a Singularity Definition File (or recipe for short) is, the analogy of a container as a computer is very useful. Imagine you got a job as an IT engineer and your company just bought 100 new shiny computers which you're tasked with configuring. You need to install exactly the same OS and software on all 100 computers, and there is nothing on them to begin with. As a good IT engineer you will write code containing a set of instructions which installs the OS and all the needed software across the computers. This code is, in essence, a Singularity recipe.
So, a Singularity recipe is a text file (it is also a piece of code) which describes the container: which exact OS is at its base; which software was installed, where and how; where any helpful files or databases are; etc.
Since a recipe is just a text file, it can be put under version control (e.g. with Git), allowing you to track any changes. It can also be shared with the community a bit more easily than a container file itself, and the container created from a particular recipe will be exactly the same every time (if the recipe is written well, of course). By convention, recipe files have a `.def` extension, but you can use any extension you fancy, e.g. `.txt`.
Let's have a look at an example of the simplest definition file:
Bootstrap: docker
From: ubuntu:20.04
%post
apt-get -y update && apt-get install -y python3
%runscript
python3 -c 'print("Hello World! Hello from our custom Singularity image!")'
Upon build, this recipe will produce a container with naked Ubuntu v20.04 as its OS, and the container will print Hello World! Hello from our custom Singularity image! upon launch.
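Assuming Singularity is installed (installation and building are covered later in this tutorial), this recipe could be built and launched like so; the file names `hello.def` and `hello.sif` are illustrative:

```shell
# save the recipe above as hello.def, then build it:
sudo singularity build hello.sif hello.def
# launching the container executes its %runscript section:
singularity run hello.sif
```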
In general, a Singularity definition file can be divided into 2 core parts: the header and the sections.
- The header is usually composed of the 2 lines with which the recipe starts. In the example above it is
Bootstrap: docker
From: ubuntu:20.04
The header determines the operating system of the future container, which in turn controls what will already be pre-installed in the container without any intervention on our part.
To continue the analogy of the IT engineer configuring 100 computers, the header represents the first thing the engineer does with a computer: OS installation.
`Bootstrap` and `From` here are keywords, like parameter names, and `docker` and `ubuntu:20.04` are the values for those parameters. The recipe should always start with the `Bootstrap:` keyword. In 99.9% of cases the line `Bootstrap: docker` will be the first line of your recipe. It says that the base OS (determined by the `From:` line) will be pulled (downloaded) from Docker Hub. You can also put `Bootstrap: library`, and then the base OS will be pulled from the Singularity Cloud. In the majority of cases they are interchangeable. There are more values you can give to `Bootstrap:`; you can read about them here.
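For example, a header pulling the base OS from the Singularity Cloud library instead of Docker Hub could look like this (a sketch; with this bootstrap agent the image reference is resolved against the container library rather than Docker Hub):

```
Bootstrap: library
From: ubuntu:20.04
```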
The `From:` line is quite powerful. An ordinary `From: ubuntu:20.04` tells the Singularity engine that you'd like to use "naked" Ubuntu v20.04 as the basis of your container, meaning that nothing except Ubuntu v20.04 itself will be installed. However, we can change the value `ubuntu:20.04` to something else to have some of the weight lifted for us. For example, the value `continuumio/miniconda3` means the basis of our container will already have a Linux OS (Debian) + Python 3 + conda installed, and we won't need to worry about python or conda installation. The table below lists some useful values for `From:`. Unfortunately, a full list of all possible values does not exist.
| Value of `From:` | What is already inside | Note |
|---|---|---|
| `ubuntu:14.04` | 'naked' Ubuntu 14.04 + Python 3.4.3 (no conda) | Unless you're 100% sure, it's better to use a more up-to-date Ubuntu version |
| `ubuntu:18.04` | 'naked' Ubuntu 18.04, no python | Same as above |
| `ubuntu:20.04` | 'naked' Ubuntu 20.04, no python | Used 50% of the time |
| `continuumio/anaconda2` | Debian GNU/Linux 10 + Python v2.7.16 + conda v4.7.12 | |
| `continuumio/miniconda2` | Debian GNU/Linux 10 + Python v2.7.16 + conda v4.7.12 | More lightweight (in GB) version of anaconda2; should be preferred to `continuumio/anaconda2` |
| `continuumio/anaconda3` | Debian GNU/Linux 11 + Python 3.9.7 + conda v4.11.0 | |
| `continuumio/miniconda3` | Debian GNU/Linux 11 + Python 3.9.7 + conda v4.11.0 | Used 49% of the time. More lightweight (in GB) version of anaconda3; should be preferred to `continuumio/anaconda3` |
| `rstudio/r-base:4.0-focal` | Ubuntu 20.04.4 + R v4.0.5, no python | Check this link to see how to specify your favorite R version |
| `ibmjava` | Ubuntu 18.04.6 + Java v8.0.7.6, no python | |
| `julia:1.3` | Debian GNU/Linux 10 + Julia v1.3.1, no python | |
The header is the only essential part of the recipe.
A definition file which consists of just a header is valid. However, when such a container is created, there will not be much inside it, as can be deduced from the table above. You can create a test container with just a header, play with it, explore it, and see which programs come packaged with the operating system (spoiler: not many).
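A minimal header-only definition file is therefore just two lines; built as-is, it yields a container with nothing but the base OS inside:

```
Bootstrap: docker
From: ubuntu:20.04
```

Building it (e.g. `sudo singularity build naked_ubuntu.sif naked_ubuntu.def`, file names illustrative) gives you a bare container to explore.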
In the example above you may have noticed lines starting with %
followed by commands:
%post
apt-get -y update && apt-get install -y python
%runscript
python -c 'print("Hello World! Hello from our custom Singularity image!")'
Those are sections.
To continue the analogy of the IT engineer configuring 100 computers, the sections represent the actual commands the engineer would use to download and install software, download and configure databases, and so on.
The section(s) of a Singularity recipe determine which/where/how software is installed, environmental variables, meta information, etc. To keep things neat, commands are divided into sections with specific purposes. As such:
- the `%post` section tells which/where/how software is installed,
- `%environment` keeps track of environmental variables, and
- `%labels` stores meta information about the container (creator name and contact, etc).
Each section is defined by `%` followed by its name and a code block. Section names are predefined: %post, %environment, %labels, %files, %help, %runscript, %startscript, %test, %setup, %app. In this tutorial only some sections will be considered.
The order of sections in the recipe does not matter.
The example Singularity recipe above is not very representative. Therefore, let's switch to another one. The recipe below is a working Singularity definition file for the widely used bioinformatics software samtools.
Bootstrap: docker
From: ubuntu:20.04
%post
# STEP 1 [REQUIRED] make installation not require user interaction
export DEBIAN_FRONTEND=noninteractive
apt-get update
apt-get upgrade -y
# STEP 2 [OPTIONAL] install some basic ubuntu/linux libraries
apt-get install -y wget curl unzip zip bzip2 tabix git \
gfortran perl gcc g++ make cmake build-essential \
software-properties-common autoconf
# STEP 3 [SOFTWARE - SPECIFIC] installation commands for a specific software
apt-get install -qq -y libncurses5-dev zlib1g-dev libbz2-dev liblzma-dev
wget https://github.com/samtools/samtools/releases/download/1.10/samtools-1.10.tar.bz2
tar -xf samtools-1.10.tar.bz2
cd samtools-1.10
./configure
make
make install
cd ../
# STEP 4 [JUST AN EXAMPLE] creation of folder, file and bash script
mkdir tiny_scripts
echo '#!/bin/bash' > tiny_scripts/say_cheese.sh
echo 'echo CHEESE!' >> tiny_scripts/say_cheese.sh
%environment
export PATH=/samtools-1.10/bin:$PATH
PI=3.14
%labels
CREATOR Maria Litovchenko
ORGANIZATION TEST_WORKING_PLACE
EMAIL
VERSION v0.0.1
%help
Main software:
samtools v.1.10 http://www.htslib.org/
Example run:
singularity exec samtools_in_a_box.sif samtools --help
The `%post` section stores commands dedicated to software installation and file/database download, and is the most important section. In essence, the absolute minimal meaningful recipe would consist of a header and a `%post` section. The `%post` section is written in bash and contains the same commands one would use to install the software on a machine running Linux/Ubuntu. In plain Ubuntu, `sudo` is often required to install software system-wide. In the `%post` section `sudo` is never used, as root privileges are granted by default while building a container. Let's have a closer look at the `%post` section in the example recipe:
- In my practice I have found that numerous software require some user input during installation, e.g. "Please type y if you agree to installation". During container creation there will be no opportunity to give any input. Therefore, all such requests must be disabled and a default response given. This is exactly what the `export DEBIAN_FRONTEND=noninteractive` command does.
# STEP 1 [REQUIRED] make installation not require user interaction
export DEBIAN_FRONTEND=noninteractive
- In step 2 some basic Linux/Ubuntu packages are installed. The listed packages are my personal preference and may not be required dependencies for the software of your choice. Yet I find them quite useful, and they are among the required dependencies of some software. Therefore this command, with some modifications, appears in the majority of my Singularity recipes. The table below describes each package's function.
# STEP 2 [OPTIONAL] install some basic ubuntu/linux libraries
apt-get install -y wget curl unzip zip bzip2 tabix git \
gfortran perl gcc g++ make cmake build-essential \
software-properties-common autoconf
| Package | Function |
|---|---|
| `wget` `curl` | to download files from the internet |
| `unzip` `zip` `bzip2` `tabix` | to zip/unzip/compress files |
| `git` | to work with GitHub repositories |
| `gfortran` `perl` | languages, Fortran and Perl. Python and R are not installed here! |
| `gcc` `g++` `make` `cmake` `build-essential` | compilers |
| `software-properties-common` | package manager |
| `autoconf` | automatically configures software source |
Ideally, during the installation process a specific version of each package/software/program should be installed, e.g. wget=1.20.3. In practice, most software does not list specific versions of its dependencies, and therefore knowing a package version in advance can be difficult. There is a trick for such cases: first build a container without requiring specific versions, then write down the versions of the software installed in the container, and re-build the container pinning those exact versions (e.g. `apt-get install -y wget=1.20.3`). Step 2, annotated with each package's role, looks like this:
# STEP 2 [OPTIONAL] install some basic ubuntu/linux libraries
# wget, curl: to download from the internet
# unzip, zip, bzip2, tabix: to zip/unzip/compress files
# git: speaks for itself!
# gfortran, perl: languages; Python and R are not installed here!
# gcc, g++, make, cmake, build-essential: compilers
# software-properties-common: package manager
# autoconf: automatically configure software source
apt-get install -y wget curl unzip zip bzip2 tabix git \
    gfortran perl gcc g++ make cmake build-essential \
    software-properties-common autoconf
ADVICE: always use the `-y` option after `apt-get install`; it sets a 'yes' response to any prompt.
- Step 3 is the actual installation of the software, samtools in the case of the example recipe. This step is completely software-specific. Usually a software's guide provides the commands for its installation. Those commands just need to be copied into the `%post` section, omitting `sudo`.
# STEP 3 [SOFTWARE - SPECIFIC] installation commands for a specific software
apt-get install -qq -y libncurses5-dev zlib1g-dev libbz2-dev liblzma-dev
wget https://github.com/samtools/samtools/releases/download/1.10/samtools-1.10.tar.bz2
tar -xf samtools-1.10.tar.bz2
cd samtools-1.10
./configure
make
make install
cd ../
- Step 4 is not a necessary step for samtools installation and functioning. It is just an example showing that one can create directories (`tiny_scripts`) and files (`say_cheese.sh`, it's even a script!) inside the container.
# STEP 4 [JUST AN EXAMPLE] creation of folder, file and bash script
mkdir tiny_scripts
echo '#!/bin/bash' > tiny_scripts/say_cheese.sh
echo 'echo CHEESE!' >> tiny_scripts/say_cheese.sh
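The two `echo` lines from step 4 can be tried outside the container to see exactly what script they generate; this sketch writes to a temporary directory instead of the container's root (paths are illustrative):

```shell
# recreate the step-4 script in /tmp instead of the container's root
mkdir -p /tmp/tiny_scripts
echo '#!/bin/bash' > /tmp/tiny_scripts/say_cheese.sh
echo 'echo CHEESE!' >> /tmp/tiny_scripts/say_cheese.sh
# run it the same way the tutorial does later
bash /tmp/tiny_scripts/say_cheese.sh   # prints CHEESE!
```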
The `%environment` section defines the environmental variables in your container.
In case you're not familiar with environmental variables, here is a little example. It is very nice that you can just open your terminal, navigate to any folder, type `samtools`, and it works, isn't it? But how does your system know where to look for the samtools executable? Through an environmental variable, of course! On an actual computer they are usually defined in the `.bashrc` file. This file is one of the first files your computer reads during booting, and therefore each time you switch your computer on it knows where samtools is. The common format for an environmental variable defining a path to a binary executable is `export PATH=/where/to/install/bin:$PATH`. Obviously, environmental variables may not only define a path to a certain software (or file); they can also define an ordinary variable, e.g. `PI=3.14`.
In a container, they are defined in the `%environment` section.
%environment
# example of environmental variable as path to samtools executable
export PATH=/samtools-1.10/bin:$PATH
# example of environmental variable defining a global variable
PI=3.14
Important note: in a container, the environmental variables from `%environment` are accessible only after the container is created, not during its creation (i.e. not during execution of the `%post` section). So, if the samtools path or the `PI` variable is needed during container creation, it has to be defined in the `%post` section as well. For example:
%post
export PATH=/where/to/install/bin:$PATH
PI=3.14
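The effect of prepending a directory to PATH can be seen in any ordinary shell; this sketch uses a throwaway directory and script name (both illustrative):

```shell
# create a tiny executable in a directory that is not yet on PATH
mkdir -p /tmp/demo_bin
printf '#!/bin/bash\necho from demo_bin\n' > /tmp/demo_bin/hello_path
chmod +x /tmp/demo_bin/hello_path
# after exporting, the shell finds hello_path by name alone
export PATH=/tmp/demo_bin:$PATH
hello_path   # prints: from demo_bin
```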
The `%labels` section provides metadata for the container.
Conventionally, one puts general information about the container there, such as who created it, the organization, a contact email, etc. It's a free-form section; there are no restrictions. The general format is a name-value pair.
%labels
CREATOR Maria Litovchenko
ORGANIZATION TEST_WORKING_PLACE
EMAIL
VERSION v0.0.1
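Once the container is built, these labels can be read back with `singularity inspect` (shown here with the container name used later in this tutorial):

```shell
singularity inspect samtools_in_a_box.sif
```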
The `%help` section contains information which helps a user interact with the container, e.g. how to run it.
This section is also free text and usually describes the main and supporting software installed in the container, as well as giving an example execution command.
%help
Main software:
samtools v.1.10 http://www.htslib.org/
Example run:
singularity exec samtools_in_a_box.sif samtools --help
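The text of the `%help` section is what a user sees when asking the finished container for help:

```shell
singularity run-help samtools_in_a_box.sif
```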
The `%files` section is used to copy files from the host computer (where the container is built) into the container.
This section can only be used if you build the container on your own computer; it cannot be used during an online build on the Singularity cloud. The general convention is that each line of `%files` is a source and destination pair. The source has to be a valid path on the host computer, and the destination is a path inside the container. Note, however, that this section significantly reduces the reproducibility of a container.
%files
/usr/bin/Desktop/my_test_file /olala
This list of sections is not all-inclusive. For the full list, please check the official documentation.
The example above was dedicated to showing how containerization of a Linux/Ubuntu command line tool is performed. Yet there is a lot of bioinformatics software which comes as R or python packages. More about their containerization in the sections below.
Due to the existence of the `continuumio/miniconda3` header, creating "pure python" containers is actually one of the easiest tasks. `continuumio/miniconda3` ensures that `python3`, `conda` and `pip` are already present in the container, and there is no need to do anything to install them. Quite nice! In comparison to the usual python package installation, there is only one difference: `conda` and `pip` are not accessible out of the box, because the environmental variables are not set up. So, in order to be able to just type `conda` and `pip` as usual, an environmental variable needs to be set up for them. In the demo recipe below it is done with the following command: `export PATH=/opt/conda/bin/:$PATH`. It covers both `pip` and `conda`.
Here is an example of numpy containerized:
Bootstrap: docker
From: continuumio/miniconda3
%post
# STEP 1: environmental variables for conda and pip
export PATH=/opt/conda/bin/:$PATH
# STEP 2: install python version you need (this is optional, but improves reproducibility)
conda install python=3.8
# STEP 3: install python packages with conda.
conda install --channel conda-forge --channel bioconda -y \
numpy=1.21.5
%environment
export PATH=/opt/conda/bin/:$PATH
%help
Main software:
numpy v.1.21.5 https://anaconda.org/anaconda/numpy
Example run:
# create script test.py which loads numpy and displays its version
echo 'import numpy' > test.py
echo 'version = numpy.__version__' >> test.py
echo 'print(version)' >> test.py
# run it
singularity exec numpy.sif python3 test.py
%labels
CREATOR Maria Litovchenko
ORGANIZATION TEST_WORKING_PLACE
EMAIL
VERSION v0.0.1
Please note that, for enhanced reproducibility, a specific version of numpy (1.21.5) was requested.
Some python packages recommend creating a python environment for installation. With containers this is not necessary. Usually python environments are created to isolate different, sometimes incompatible, packages from each other. If a container only has mutually compatible python packages installed, there is no need to isolate them from one another. And if some packages are not compatible, my recommendation is to create two separate containers for them.
The example above is the recommended way to create a container with python packages installed. However, if the `continuumio/miniconda3` header cannot provide the needed functionality, and one chooses to use `ubuntu:20.04` instead, here is how conda can be installed into a container from scratch:
Bootstrap: docker
From: ubuntu:20.04
%post
# STEP 1 [REQUIRED] make installation not require user interaction
export DEBIAN_FRONTEND=noninteractive
apt-get update
apt-get upgrade -y
# STEP 2 [OPTIONAL] install some basic ubuntu/linux libraries
apt-get install -y wget curl unzip zip bzip2 tabix git \
gfortran perl gcc g++ make cmake build-essential \
software-properties-common autoconf
# STEP 3: download Miniconda3 v4.7.12 (check out https://repo.continuum.io/miniconda/ for other versions)
cd /opt
rm -fr miniconda
wget https://repo.continuum.io/miniconda/Miniconda3-4.7.12-Linux-x86_64.sh -O miniconda.sh
# STEP 4: install conda
bash miniconda.sh -b -p /opt/miniconda
export PATH="/opt/miniconda/bin:$PATH"
# now conda is available and can be used, i.e.
conda install --channel conda-forge --channel bioconda -y numpy=1.21.5
%environment
export PATH="/opt/miniconda/bin:$PATH"
%labels
CREATOR Maria Litovchenko
ORGANIZATION TEST_WORKING_PLACE
EMAIL
VERSION v0.0.1
%help
A test container to show installation of conda on Ubuntu 20.04 from scratch
singularity exec conda_from_scratch.sif conda
For the creation of containers with R libraries there is also a trick: the header `rstudio/r-base:4.0-focal`, which simplifies the build thanks to pre-installed R. If you want another version of R, scroll to the section above and choose the proper value to give to `From:`. In general, to install R libraries inside the container, the same `install.packages` command is used. However, it is prefixed with `Rscript -e` (see the example below). Also, the `repos` argument of `install.packages` is used to indicate a CRAN mirror. An example for Bioconductor packages is also given below.
Bootstrap: docker
From: rstudio/r-base:4.0-focal
%post
# installation of data.table package from CRAN
Rscript -e "install.packages(c('data.table'), repos = 'http://cran.us.r-project.org', quietly = TRUE)"
# installation of BiocManager & the limma package from Bioconductor
Rscript -e "install.packages(c('BiocManager'), repos = 'http://cran.us.r-project.org', quietly = TRUE)"
Rscript -e "BiocManager::install(c('limma'), ask = FALSE, update = FALSE)"
%labels
CREATOR Maria Litovchenko
ORGANIZATION TEST_WORKING_PLACE
EMAIL
VERSION v0.0.1
%help
A test container showing how to install R packages
echo 'library(data.table)' > test.R
echo 'library(limma)' >> test.R
singularity exec limma.sif Rscript --vanilla test.R
The example above is the recommended way to create a container with R libraries installed. However, if the `rstudio/r-base:4.0-focal` header cannot provide the needed functionality, and one chooses to use `ubuntu:20.04` instead, here is how R can be installed into a container from scratch:
Bootstrap: docker
From: ubuntu:20.04
%post
# STEP 1 [REQUIRED] make installation not require user interaction
export DEBIAN_FRONTEND=noninteractive
apt-get update
apt-get upgrade -y
# STEP 2 [REQUIRED] install some basic ubuntu/linux libraries
apt-get install -y wget curl unzip zip bzip2 tabix git \
gfortran perl gcc g++ make cmake build-essential \
software-properties-common autoconf
# STEP 3 install R
apt-get -qq -y install libncurses5-dev zlib1g-dev libbz2-dev gnupg2 \
libcurl4-openssl-dev libssl-dev libxml2-dev
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/'
apt-get -qq -y dist-upgrade
apt-get -qq -y install r-base r-base-core r-recommended
%labels
CREATOR Maria Litovchenko
ORGANIZATION TEST_WORKING_PLACE
EMAIL
VERSION v0.0.1
%help
A test container showing how to install R from scratch
If you'd like to create a Frankenstein Singularity container which contains bash command line software, R libraries and python packages, the best way to go is to use the `continuumio/miniconda3` header rather than `ubuntu:20.04`. This is because `continuumio/miniconda3` already provides you with Linux and `conda`, and R can be installed via conda: `conda install -c conda-forge r-base`. Also, a lot of command line tools, e.g. the `bwa` aligner, can be installed through conda! Always check https://anaconda.org/ for the availability of your command line tools before you start the container design! It can greatly simplify the design and take care of numerous dependencies.
Now that we have a Singularity definition file written, we can finally build our container.
It is implied that Singularity is installed on your computer/server and that you have root privileges. For the macOS installation guide check out this link, and for Ubuntu this one. If you have a Windows machine, you can build your container in the cloud or install a virtual machine with Ubuntu and Singularity.
Building a container on your computer is extremely easy! It's essentially a one-liner:
sudo singularity build name_of_the_container.sif definition_file_name.def
In the case of the example container for samtools from above:
sudo singularity build samtools_in_a_box.sif samtools.def
Once the command has executed, a file `samtools_in_a_box.sif` will appear. It is a container, fully ready to use!
Singularity also provides an opportunity to build containers in the cloud, which takes away the hassle of installing Singularity on a personal computer. Please follow this tutorial to build your container in the cloud.
This section implies that you have Singularity installed on your computer/server. For the macOS installation guide check out this link, and for Ubuntu this one. If you have a Windows machine, you're on your own.
If the container was built in the cloud, pull it using the corresponding command:
singularity pull --arch amd64 library://marialitovchenko/marialitovchenko/samtools_in_a_box:latest
If the container was built offline, you already have it!
Once your container is created/pulled/downloaded, you can interact with it in several modes.
- Checking integrity. As described above, a container is something like a remote server in the shell of a sif file. If it's like a server, we can log in to it and check the installation of the software. To get inside the container, use the shell command to tell Singularity to let you in:
singularity shell samtools_in_a_box_latest.sif
This will change your terminal prompt to:
Singularity>
Congratulations! You are inside the container! Command:
ls -l /
will display all the files and folders inside the container:
total 4615
lrwxrwxrwx 1 root root 7 Apr 26 19:49 bin -> usr/bin
drwxr-xr-x 2 root root 3 Apr 15 2020 boot
drwxr-xr-x 18 root root 3800 May 11 16:51 dev
lrwxrwxrwx 1 root root 36 May 11 16:13 environment -> .singularity.d/env/90-environment.sh
drwxr-xr-x 53 root root 1819 May 11 16:14 etc
drwxr-xr-x 1 vagrant vagrant 60 May 11 16:59 home
lrwxrwxrwx 1 root root 7 Apr 26 19:49 lib -> usr/lib
lrwxrwxrwx 1 root root 9 Apr 26 19:49 lib32 -> usr/lib32
lrwxrwxrwx 1 root root 9 Apr 26 19:49 lib64 -> usr/lib64
lrwxrwxrwx 1 root root 10 Apr 26 19:49 libx32 -> usr/libx32
drwxr-xr-x 2 root root 3 Apr 26 19:49 media
drwxr-xr-x 2 root root 3 Apr 26 19:49 mnt
drwxr-xr-x 2 root root 3 Apr 26 19:49 opt
dr-xr-xr-x 102 root root 0 May 11 16:52 proc
drwx------ 2 root root 64 May 11 16:14 root
drwxr-xr-x 8 root root 136 May 11 16:13 run
drwxrwxr-x 9 root root 2371 May 11 16:15 samtools-1.10
-rw-rw-r-- 1 root root 4721173 Dec 7 22:42 samtools-1.10.tar.bz2
lrwxrwxrwx 1 root root 8 Apr 26 19:49 sbin -> usr/sbin
lrwxrwxrwx 1 root root 24 May 11 16:13 singularity -> .singularity.d/runscript
drwxr-xr-x 2 root root 3 Apr 26 19:49 srv
dr-xr-xr-x 13 root root 0 May 11 16:51 sys
drwxrwxr-x 2 root root 36 May 11 16:15 tiny_scripts
drwxrwxrwt 8 root root 4096 May 11 16:59 tmp
drwxr-xr-x 14 root root 217 May 11 16:14 usr
drwxr-xr-x 11 root root 160 Apr 26 19:52 var
You can even see the `tiny_scripts` folder which we created in the recipe above!
ls -l /tiny_scripts/
Now, if just the `ls -l` command is executed (without the slash!), you'll see the files located in the same folder as your container (and even the container itself):
total 238104
drwxr-xr-x 3 root root 4096 Nov 16 21:31 go
-rw-rw-r-- 1 vagrant vagrant 0 May 11 17:04 samtools.def
-rwxrwxr-x 1 vagrant vagrant 243810304 May 11 16:56 samtools_in_a_box_latest.sif
This shows that the container has access not only to the files located inside it, but also to ones located outside it.
Now you can check the software installation and in general interact with the terminal just as you would in a usual terminal. To check the samtools installation, run:
samtools --help
Result should be:
Program: samtools (Tools for alignments in the SAM format)
Version: 1.10 (using htslib 1.10)
Usage: samtools <command> [options]
Commands:
-- Indexing
dict create a sequence dictionary file
faidx index/extract FASTA
fqidx index/extract FASTQ
index index alignment
...
To log out of the container, type
exit
- Executing software from the container. Singularity makes it very easy to run software from the container. Essentially, all the commands used in the terminal to run the software stay the same. The only change one needs to make is to put `singularity exec your_container.sif` in front of the command. For example, to execute the samtools installed in our test container, the following command is used:
singularity exec samtools_in_a_box_latest.sif samtools --help
The following message should appear on your screen, proving that samtools worked:
Program: samtools (Tools for alignments in the SAM format)
Version: 1.10 (using htslib 1.10)
Usage: samtools <command> [options]
Commands:
-- Indexing
dict create a sequence dictionary file
faidx index/extract FASTA
fqidx index/extract FASTQ
index index alignment
...
Basically, you can put any bash command after `singularity exec samtools_in_a_box_latest.sif` and it will be executed inside the container. For example, to list the files inside the container:
singularity exec samtools_in_a_box_latest.sif ls -l /
2a. Executing software from the container if it is not globally accessible. In the example above samtools is globally accessible, meaning that regardless of the current folder inside the container, if the `samtools` command is called, samtools will be executed. However, it can happen that a software you (or someone else) put inside the container was not made globally accessible. In such a case, a full path to the software inside the container should be given after `samtools_in_a_box_latest.sif`. During the creation of `samtools_in_a_box_latest.sif`, a `say_cheese.sh` script was created in the folder tiny_scripts. In order to execute it:
singularity exec samtools_in_a_box_latest.sif bash /tiny_scripts/say_cheese.sh
Result should be:
CHEESE!
The slash in front of `tiny_scripts/say_cheese.sh` is imperative, as it indicates the root directory and that the file to be executed is inside the container.
2b. Binding. None of the commands above required files from your computer as input, which of course will not be the case when you use the container for data analysis. There is a small detail in the usage of Singularity containers associated with the location of your input files. Let's create a small sam file to serve as a test:
wget https://raw.githubusercontent.com/samtools/samtools/develop/examples/toy.sam
Now let's use our container with samtools to convert that sam file to bam format:
singularity exec samtools_in_a_box_latest.sif samtools view -bS toy.sam > small_test.bam
If you now check your computer with `ls -l`, you'll see that `small_test.bam` was created. The container worked! Hooray! However, it is unlikely that all of your input files will be located in the same directory as the container. So let's move `toy.sam` to a directory just outside the one containing our container:
# create the directory
mkdir ../sam_files_dir/
# move
mv toy.sam ../sam_files_dir/
and let's try to execute the same command again:
singularity exec samtools_in_a_box_latest.sif samtools view -bS /sam_files_dir/toy.sam > small_test_1.bam
An error occurs:
FATAL: could not open image /home/vagrant/sam_files_dir/olala/samtools_in_a_box_latest.sif: failed to retrieve path for /home/vagrant/sam_files_dir/olala/samtools_in_a_box_latest.sif: lstat /home/vagrant/sam_files_dir/samtools_in_a_box_latest.sif: no such file or directory.
despite the fact that a full path to `/sam_files_dir/toy.sam` was given! The error message says the file cannot be found. Of course, in reality it exists; the container just doesn't "see" it, meaning it can't detect it. This problem is solved with binding. Singularity allows you to map directories on your host system to directories within your container using bind mounts. Binding is very similar to mounting directories of a remote server on your own computer. Binding is performed via the `--bind` option given to singularity, followed by a full path to the folder on your computer you'd like to bind and a full path to the folder in the container you'd like it to be bound to, separated by ":". For example: `singularity exec --bind path_on_your_computer:path_to_be_used_inside_the_container`. The folder inside the container does not have to exist beforehand. In fact, I find it most convenient to use exactly the same full path as on the computer for the binding path inside the container. This ensures that all your code which uses absolute paths to input files runs smoothly. For example, the container with samtools we created above is located in the containers folder, and toy.sam is located in sam_files_dir; both are subfolders of the singularity_showcase folder. Let's use that for binding. In general, it's a good idea to use as the bind path a directory which is a parent of both the directory containing your container (sif file) and your input files.
singularity exec --bind /singularity_showcase/:/singularity_showcase/ samtools_in_a_box_latest.sif samtools view -bS /singularity_showcase/sam_files_dir/toy.sam > small_test_1.bam
Worked!
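If you bind the same directory in every call, the bind specification can also be supplied once per shell session through the SINGULARITY_BIND environment variable instead of repeating `--bind` (path shown is the tutorial's example folder):

```shell
export SINGULARITY_BIND="/singularity_showcase:/singularity_showcase"
singularity exec samtools_in_a_box_latest.sif ls /singularity_showcase
```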
To download an external file (e.g. one stored on Dropbox) into your container during the build:
- Get your file
- Put the file into your Dropbox
- Get a link to your file from Dropbox ('copy link'); it will have a format something like this:
https://www.dropbox.com/s/vzhwjgbbmdnq135/12864_2020_6583_MOESM2_ESM.csv?dl=0
- delete '?dl=0' from the end of the link
- insert the following code into your recipe:
wget https://www.dropbox.com/s/vzhwjgbbmdnq135/12864_2020_6583_MOESM2_ESM.csv
How do you see the recipe of an already built container? Use `singularity inspect --deffile my_container.sif` (while `singularity run-help my_container.sif` shows its %help section). For more on definition files, see the official documentation:
https://sylabs.io/guides/3.5/user-guide/definition_files.html