Singularity containers - mackeylab/home GitHub Wiki
Singularity containers are basically packaged-up pieces of software that are portable and reproducible.
This allows you to run an analysis or pre-processing pipeline the same way anywhere, without worrying about software installations, $PATH changes, or environment issues! Hence the term container. The Singularity documentation has quite a long exposé on why containers are the best thing since sliced bread, but suffice it to say that I think they are the way of the future for reproducible analyses.
On high-performance computing clusters, Singularity is the tool used to work with containers. First make sure you have it in your $PATH, so that when you call singularity, you get something other than command not found.
SINGULARITY_PATH=/share/apps/singularity/2.5.1/bin
PATH=${SINGULARITY_PATH}:${PATH}
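A quick way to confirm that the $PATH change worked is to ask the shell where it finds the command. The snippet below is a minimal sketch; the helper name check_on_path is made up for illustration and is not part of Singularity.

```shell
# Hypothetical helper: report whether a command can be found on $PATH.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $(command -v "$1")"
    else
        echo "not found: $1"
        return 1
    fi
}

# Prints the location of singularity, or "not found: singularity".
check_on_path singularity || true
```

If this reports not found, re-check the SINGULARITY_PATH value for your cluster.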
Note: As of Singularity version 3.0 (on the CUBIC cluster, for example), the .simg extension for containers was replaced with .sif. On those versions, call your container as <my-container>.sif instead of <my-container>.simg.
First, you have to get a container. You can get a container from Docker Hub (Singularity containers are similar to Docker containers, if you've heard of those), with the command below.
singularity build <name-of-container-and-version>.sif docker://<dockerhub-location>:<version>
Here's an example, pulling the heudiconv container:
singularity build /my-location/heudiconv-0.5.4.sif docker://nipy/heudiconv:0.5.4
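If you build several containers, it can help to generate the build command from the Docker Hub location and version so that the file names stay consistent. This is only a sketch: the function name build_cmd and the <name>-<version>.sif naming pattern are assumptions modeled on the heudiconv example above. It prints the command rather than running it, so you can check it first.

```shell
# Hypothetical helper: print a singularity build command for a Docker Hub image.
# Arguments: <dockerhub-location> <version> <output-folder>
build_cmd() {
    repo=$1
    tag=$2
    outdir=$3
    # Use the last part of the Docker Hub location as the container name.
    name=$(basename "$repo")
    echo "singularity build ${outdir}/${name}-${tag}.sif docker://${repo}:${tag}"
}

build_cmd nipy/heudiconv 0.5.4 /my-location
# prints: singularity build /my-location/heudiconv-0.5.4.sif docker://nipy/heudiconv:0.5.4
```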
On CUBIC, you need to set SINGULARITY_TMPDIR and SINGULARITY_CACHEDIR before building. Add these two lines to your .bashrc:
export SINGULARITY_TMPDIR=$SBIA_TMPDIR
export SINGULARITY_CACHEDIR=$SBIA_TMPDIR
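Before a long build, it may be worth checking that both variables are set and point at a writable folder. The check below is a sketch, and check_build_dirs is a made-up name, not a Singularity command.

```shell
# Hypothetical helper: verify the two build variables point at writable folders.
check_build_dirs() {
    for var in SINGULARITY_TMPDIR SINGULARITY_CACHEDIR; do
        # Look up the value of the variable whose name is stored in $var.
        eval "dir=\${$var:-}"
        if [ -z "$dir" ]; then
            echo "$var is not set"
            return 1
        fi
        if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
            echo "$var=$dir is not a writable directory"
            return 1
        fi
    done
    echo "build directories OK"
}

check_build_dirs || true
```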
When you use a container, you need to bind, or connect, the folders on your local filesystem (or the cluster filesystem) to the container. This allows processes that are run inside the container to access the data/folders that you want to run those processes on.
The location you're binding to inside the container is called the bind path or the mount point. You bind locations with the -B flag, using -B <local-folder>:<container-folder>. Usually the /mnt and /scratch bind paths already exist inside the container, and $HOME is bound by default. For more on this, see the Singularity docs.
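To bind more than one folder, -B accepts several <local-folder>:<container-folder> pairs separated by commas, which is the form the freesurfer_container alias on this page uses. The sketch below builds that comma-separated value from a list of pairs; join_binds is a hypothetical helper and the folders are placeholders.

```shell
# Hypothetical helper: join bind pairs into one comma-separated -B value.
join_binds() {
    binds=""
    for pair in "$@"; do
        # Add a comma only when something is already in the list.
        binds="${binds:+$binds,}$pair"
    done
    echo "$binds"
}

join_binds /data/project:/mnt /tmp/work:/scratch
# prints: /data/project:/mnt,/tmp/work:/scratch

# Possible use (placeholder container name):
# singularity run --cleanenv -B "$(join_binds /data/project:/mnt /tmp/work:/scratch)" <my-container>.sif
```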
To run a container, use the command below. You can usually get help by running a container with no arguments, or look around inside it using singularity shell (see below).
singularity run --cleanenv -B <local-folder>:<container-folder> <my-container>.simg <more-arguments-to-pass-along-to-the-container>
Here's an example, running the BIDS-validator container.
singularity run --cleanenv -B /data/jux/mackey_group/public_data/ABCD:/mnt /data/picsl/mackey_group/tools/singularity/bids-validator-v1.4.0.simg /mnt/bids_release2_site14site20/
This runs the bids-validator-v1.4.0.simg container on a folder called bids_release2_site14site20, which exists at /data/jux/mackey_group/public_data/ABCD/bids_release2_site14site20/. Notice that I bound the parent folder of bids_release2_site14site20 to /mnt in the container, so inside the container the dataset is at /mnt/bids_release2_site14site20/.
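The most common mistake with binds is passing the container a path as it looks outside the container. The sketch below works out what a local path becomes inside the container for a given bind; to_container_path is a made-up helper for illustration, not a Singularity feature.

```shell
# Hypothetical helper: translate a local path to its path inside the container.
# Arguments: <local-path> <local-folder> <container-folder>
to_container_path() {
    host_path=$1
    bind_src=${2%/}   # drop a trailing slash, if any
    bind_dst=${3%/}
    case "$host_path" in
        "$bind_src"|"$bind_src"/*)
            # Keep everything after the bound folder and put it under the bind path.
            rest=${host_path#"$bind_src"}
            echo "${bind_dst}${rest}"
            ;;
        *)
            echo "$host_path is not under $bind_src" >&2
            return 1
            ;;
    esac
}

to_container_path /data/jux/mackey_group/public_data/ABCD/bids_release2_site14site20/ \
    /data/jux/mackey_group/public_data/ABCD /mnt
# prints: /mnt/bids_release2_site14site20/
```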
There is a Freesurfer container in the main CBPD data directory at /cbica/projects/cbpd_main_data/tools/singularity/freesurfer-6.0.0.sif. I highly recommend using this container to run Freesurfer 6.0.0, as this should take care of any version dependencies that differ on CUBIC.
There is a script in /cbica/projects/cbpd_main_data/tools/singularity called script_to_run_freesurfer that should allow you to run Freesurfer on the input data of your choice using this container. To make things even easier, add the line below to your ~/.bashrc file. Then, you can run Freesurfer commands of your choice (including recon-all) using this container, just by putting freesurfer_container on the command line before your command.
alias freesurfer_container="singularity run --cleanenv -B /cbica/projects/cbpd_main_data/license.txt:/opt/freesurfer/license.txt,/cbica/software/external/matlab/mcr/v711/:/opt/freesurfer/MCRv80 /cbica/projects/cbpd_main_data/tools/singularity/freesurfer-6.0.0.sif"
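One caveat: bash does not expand aliases inside non-interactive scripts by default, so the alias may not work in a job script. A shell function with the same bind paths is one possible alternative. This is a sketch, and the name run_freesurfer is made up here (chosen to avoid clashing with the alias).

```shell
# Hypothetical function version of the alias above, usable inside scripts.
run_freesurfer() {
    singularity run --cleanenv \
        -B /cbica/projects/cbpd_main_data/license.txt:/opt/freesurfer/license.txt,/cbica/software/external/matlab/mcr/v711/:/opt/freesurfer/MCRv80 \
        /cbica/projects/cbpd_main_data/tools/singularity/freesurfer-6.0.0.sif "$@"
}

# Example use, with whatever Freesurfer command and arguments you need:
# run_freesurfer recon-all <your-recon-all-arguments>
```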
For really large containers, if building throws an error, you may need to build them in a qlogin session with more memory.
You may find that your home directory fills up with temporary files from building containers. After a container is built and ready, you can delete those. You can also try setting SINGULARITY_CACHEDIR to put them somewhere else, as in the CUBIC settings above.
You can use singularity shell -B <local-folder>:<container-bind-point> <my-container>.simg to just open up a shell inside the container and look around. This can be useful in troubleshooting the command you're trying to run or your binding of folders outside the container to inside the container.
The Singularity documentation is also quite extensive. If you run into trouble with these instructions, search their site. The fMRIprep docs also have lots of tips on troubleshooting Singularity containers on computing clusters.