Containers
- Singularity 3.x Usage on HPC Clusters
- Using docker and singularity images from existing container libraries
- Using Singularity
- Slurm Batch Jobs with Containers
- Building Containers
- Things to Keep in Mind
- Workshop Materials
Singularity 3.x Usage on HPC Clusters
Singularity is available on the HPC clusters at Iowa State. It lets users create their own environments and develop and customize their workflows without admin intervention.
As always, email hpc-help@iastate.edu if you have questions regarding the use of Singularity.
Using docker and singularity images from existing container libraries
List of useful container libraries
1. Docker Based Container Libraries
Docker Hub: https://hub.docker.com/
Nvidia GPU-Accelerated Containers (NGC): https://ngc.nvidia.com/
Quay (Bioinformatics): https://quay.io/ or https://biocontainers.pro/#/registry
2. Singularity Container Library
Singularity Library: https://cloud.sylabs.io/library
Using Singularity
Visit any one of the above listed libraries and search for the container image you need.
After gaining interactive access to one of the compute nodes and loading the module, first pull (download) the pre-built image:
module load singularity # loads the latest singularity module v3.1.1 as of 09-23-2019
singularity pull docker://gcc # pulls an image from docker hub
This pulls the latest GCC container (gcc v 9.1.0 as of 08/19/2019) and saves the image in your current working directory.
If you prefer a specific GCC version, look at the available tags for the container and append the tag to the container name. For example, to pull GCC v5.3.0:
singularity pull docker://gcc:5.3.0
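When several versions are needed, the tag syntax can be wrapped in a small helper. The following is a minimal sketch: the function name image_uri is hypothetical (it is not part of Singularity), and it only builds the docker:// URI without contacting any registry.

```shell
# image_uri NAME [TAG]: print the docker:// URI for an image.
# Hypothetical helper for illustration; the tag defaults to "latest".
image_uri() {
    name="$1"
    tag="${2:-latest}"
    printf 'docker://%s:%s\n' "$name" "$tag"
}

image_uri gcc 5.3.0   # prints docker://gcc:5.3.0
image_uri gcc         # prints docker://gcc:latest

# On a compute node with the module loaded, the URI can be passed to pull:
#   singularity pull "$(image_uri gcc 5.3.0)"
```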
Note: You can pull the images to a directory of your choosing (assuming you have write permission) by setting the variables SINGULARITY_CACHEDIR and SINGULARITY_TMPDIR. For instance,
export SINGULARITY_CACHEDIR=$TMPDIR
export SINGULARITY_TMPDIR=$TMPDIR
Note on home directory and Singularity
While pulling the containers, pay attention to the home directory as the cached image blobs will be saved in ${HOME}/.singularity
Since the home directory has a limited amount of space, this can fill up quite easily. Users can change where the files will be cached by setting SINGULARITY_CACHEDIR and SINGULARITY_TMPDIR environment variables.
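As a rough sketch of the advice above, the snippet below points both variables at scratch space and reports how much the default cache in the home directory currently holds. The fallback to a directory created by mktemp when TMPDIR is unset is an assumption made so the snippet also works outside a job; the variable name scratch is illustrative.

```shell
# Use the job scratch space if TMPDIR is set; otherwise create a throwaway
# directory (assumption, so the snippet also works outside a Slurm job).
scratch="${TMPDIR:-$(mktemp -d)}"
mkdir -p "$scratch"

export SINGULARITY_CACHEDIR="$scratch"
export SINGULARITY_TMPDIR="$scratch"

# Report how much space the default cache in the home directory uses, if present.
if [ -d "${HOME}/.singularity" ]; then
    du -sh "${HOME}/.singularity"
fi

echo "Singularity cache and temporary files: ${SINGULARITY_CACHEDIR}"
```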
To use the executables within the container
$ singularity exec gcc.img gcc --version
gcc (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
You can use this to compile your code or run your scripts using the container’s executables.
$ singularity exec gcc.img gcc hello_world.c -o hello_world
$ singularity exec gcc.img ./hello_world
Hello, World!
/home/${USER}, /work, /ptmp and ${TMPDIR} are accessible from within the container.
$ pwd
/home/ynanyam
$ ls
catkin_ws cuda env_before luarocks pycuda_test.py
$ singularity exec gcc.img pwd
/home/ynanyam
$ singularity exec gcc.img ls
catkin_ws cuda env_before luarocks pycuda_test.py
Interactive access
To gain interactive access to the container
$ singularity shell gcc.img
Singularity gcc.img:~> gcc --version
gcc (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
To access GPUs within a container, use the "--nv" option
singularity exec --nv <options>
singularity shell --nv <options>
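If the same script runs on both GPU and non-GPU nodes, the option can be added conditionally. This is only a sketch: the helper name gpu_flag is hypothetical, and treating the presence of nvidia-smi on the host as the sign of a GPU node is an assumption, not a Singularity feature.

```shell
# gpu_flag: print "--nv" when the host provides nvidia-smi, otherwise print nothing.
# Hypothetical helper; the nvidia-smi check is an assumed way to detect a GPU node.
gpu_flag() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        printf '%s\n' '--nv'
    fi
}

echo "GPU flag: '$(gpu_flag)'"

# Example use with the image pulled earlier. The substitution is left unquoted
# on purpose so that an empty result adds no argument:
#   singularity shell $(gpu_flag) gcc.img
```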
Slurm Batch Jobs with Containers
Here is an example Slurm batch script that downloads a container and uses it to run a program.
#!/bin/bash
#SBATCH -N1
#SBATCH -n20
#SBATCH -t120
unset XDG_RUNTIME_DIR
cd $TMPDIR
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/Bactrocera_oleae/protein/protein.fa.gz
gunzip protein.fa.gz
SINGULARITY_TMPDIR=$TMPDIR SINGULARITY_CACHEDIR=$TMPDIR singularity build clustalo.sif docker://quay.io/biocontainers/clustalo:1.2.4--1
singularity exec clustalo.sif clustalo -i protein.fa -o result.fasta --threads=${SLURM_NPROCS} -v
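The script above rebuilds the image in $TMPDIR on every run. One possible variation, sketched below, keeps the .sif file in a permanent location and builds it only when it is missing. The helper name ensure_image and the IMAGE_DIR location are hypothetical; choose a directory where you have write permission and enough space.

```shell
# ensure_image SIF URI: build the image only if the .sif file does not exist yet.
# Hypothetical helper for illustration.
ensure_image() {
    sif="$1"
    uri="$2"
    if [ -f "$sif" ]; then
        echo "Reusing existing image: $sif"
    else
        singularity build "$sif" "$uri"
    fi
}

# Example call inside a batch script (IMAGE_DIR is an assumed location):
#   IMAGE_DIR=/work/<your group>/containers
#   ensure_image "$IMAGE_DIR/clustalo.sif" docker://quay.io/biocontainers/clustalo:1.2.4--1
#   singularity exec "$IMAGE_DIR/clustalo.sif" clustalo --version
```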
Singularity and MPI
Reference: https://sylabs.io/guides/3.3/user-guide/mpi.html
Singularity supports running MPI applications in a container. To use Singularity containers with MPI, MPI must be available both on the host and within the container.
This requires the host and container MPI versions to be compatible. One way to make sure the versions match is to use the MPI executables and libraries available on the host and bind them into the container.
Below is a simple MPI batch script that uses OpenMPI:
#!/bin/bash
#SBATCH -N1
#SBATCH -n4
#SBATCH -t20
#SBATCH -e slurm-%j.err
#SBATCH -o slurm-%j.out
unset XDG_RUNTIME_DIR
cd $TMPDIR
SINGULARITY_TMPDIR=$TMPDIR SINGULARITY_CACHEDIR=$TMPDIR singularity build centos-openmpi.sif docker://centos
export SINGULARITYENV_PREPEND_PATH=/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c/bin
export SINGULARITYENV_LD_LIBRARY_PATH=/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c/lib
wget https://raw.githubusercontent.com/wesleykendall/mpitutorial/gh-pages/tutorials/mpi-hello-world/code/mpi_hello_world.c
module load openmpi
mpicc mpi_hello_world.c -o hello_world
mpirun -np ${SLURM_NPROCS} singularity exec --bind /opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c centos-openmpi.sif ./hello_world
In the example script above, OpenMPI is made available within the container using
export SINGULARITYENV_PREPEND_PATH=/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c/bin
export SINGULARITYENV_LD_LIBRARY_PATH=/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c/lib
and then bind mounted into the container. (By default, we do not bind mount the software directories.)
In most cases, the containers we use do not include the InfiniBand drivers, so OpenMPI fails to use InfiniBand and falls back to the Ethernet interface.
If you build your own containers that make use of MPI, be sure to include the OFED IB driver stack.
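The long installation path in the script above is specific to one OpenMPI build. As a sketch, the prefix can instead be derived from whichever mpirun is on the PATH after the module is loaded. The helper name mpi_prefix is hypothetical, and the approach assumes the usual layout with bin/ and lib/ under a single prefix and that mpirun is not a symlink into another tree.

```shell
# mpi_prefix MPIRUN_PATH: print the installation prefix, assuming the usual
# <prefix>/bin/mpirun layout. Hypothetical helper for illustration.
mpi_prefix() {
    dirname "$(dirname "$1")"
}

# Export the variables only when mpirun is on the PATH
# (for example after "module load openmpi").
if command -v mpirun >/dev/null 2>&1; then
    MPI_PREFIX="$(mpi_prefix "$(command -v mpirun)")"
    export SINGULARITYENV_PREPEND_PATH="${MPI_PREFIX}/bin"
    export SINGULARITYENV_LD_LIBRARY_PATH="${MPI_PREFIX}/lib"
    echo "Bind this directory into the container: ${MPI_PREFIX}"
fi
```

The same ${MPI_PREFIX} value can then be passed to the --bind option of singularity exec in place of the hard-coded path.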
Singularity and Nvidia Container Library (Optional)
Nvidia provides a container library that works with both Docker and Singularity.
UPDATE (05/11/2020):
The NGC container library is now public. No account or API key is required to access the library.
Before this change, users had to create an NGC account and an API key (save your key in a secure location!). Instructions to get started are here: https://docs.nvidia.com/ngc/ngc-user-guide/singularity.html#singularity
If you use an account, export the required environment variables and download the containers from the library. https://ngc.nvidia.com/catalog/containers
NOTE:
Do not set the environment variables SINGULARITY_DOCKER_USERNAME and SINGULARITY_DOCKER_PASSWORD in your .bashrc file, as this prevents pulling containers from libraries other than NGC. See https://groups.google.com/a/lbl.gov/forum/#!topic/singularity/9q1aTycZ6CA
Below is an example script that downloads the NAMD container and then runs an example on a GPU node. https://ngc.nvidia.com/catalog/containers/hpc:namd
#!/bin/bash
#SBATCH -N1
#SBATCH -n36
#SBATCH -t10
#SBATCH -pgpu
unset XDG_RUNTIME_DIR
cd $TMPDIR
export SINGULARITY_DOCKER_USERNAME='$oauthtoken'
export SINGULARITY_DOCKER_PASSWORD=<API Key>
wget http://www.ks.uiuc.edu/Research/namd/utilities/stmv.tar.gz
tar xf stmv.tar.gz
curl -s https://www.ks.uiuc.edu/Research/namd/2.13/benchmarks/stmv_nve_cuda.namd > stmv/stmv_nve_cuda.namd # constant energy
curl -s https://www.ks.uiuc.edu/Research/namd/2.13/benchmarks/stmv_npt_cuda.namd > stmv/stmv_npt_cuda.namd # constant pressure
nvidia-cuda-mps-server # loads nvidia modules to work with singularity GPU containers
SINGULARITY_TMPDIR=$TMPDIR SINGULARITY_CACHEDIR=$TMPDIR singularity pull docker://nvcr.io/hpc/namd:2.13-singlenode
singularity exec --nv --bind $PWD:/host_pwd namd_2.13-singlenode.sif namd2 +ppn ${SLURM_NPROCS} +setcpuaffinity +idlepoll +devices 0,1 /host_pwd/stmv/stmv_nve_cuda.namd # stmv constant energy benchmark
singularity exec --nv --bind $PWD:/host_pwd namd_2.13-singlenode.sif namd2 +ppn ${SLURM_NPROCS} +setcpuaffinity +idlepoll +devices 0,1 /host_pwd/stmv/stmv_npt_cuda.namd # stmv constant pressure benchmark
Building Containers
Since Docker images are compatible with Singularity, users can write recipes for their containers either as a Dockerfile or as a Singularity definition file.
Be aware that regular users do not have the privileges to build container images from recipes on the cluster. They need either privileged access to a machine with Docker or Singularity installed, or they can push their recipes to Docker Hub or the Singularity Library and build the container images there. We recommend the latter.
Docker
Below are links to get started on building Docker containers:
Dockerfile reference - https://docs.docker.com/engine/reference/builder/
Upload your Dockerfiles to Docker Hub - https://docs.docker.com/docker-hub/
Singularity
Create an account at Sylabs to build a container using a definition file. https://cloud.sylabs.io/builder
Once logged in, you are presented with a text box in which to write the definition.
Detailed documentation for Singularity definition files is available in the Singularity user guide. Below is a simple example to get you started.
Bootstrap: docker
From: continuumio/miniconda3
%labels
maintainer "Name" <email address>
%post
apt-get update && apt-get install -y git
# Conda install stringtie
conda install -y -c bioconda stringtie
The header specifies the bootstrap agent (Bootstrap) and the base image (From) of the container.
Most definition files use docker as their bootstrap agent because the Docker library is more robust and well maintained.
This example uses an existing Docker image for Miniconda (Python 3) available from https://hub.docker.com/r/continuumio/miniconda3
The %post section contains any modifications or additions the user makes to the original container. In this case, we add the stringtie package to the container.
Things to Keep in Mind
- By default, work, home, ${TMPDIR} and ptmp are bind mounted into the container
- User outside the container = User inside the container (this implies that the permissions within the container are the same as on the bare-metal compute node)
- The entire networking stack is available from within the container. If the container has the InfiniBand stack installed, it will make use of the network
- Having "unset XDG_RUNTIME_DIR" in your Slurm script is useful when you run Jupyter Notebook in the container. It also removes some annoying warnings from your logs
- In the examples above, everything was done within ${TMPDIR}, which will be deleted at the end of the job. Make sure you copy the output to your project directory to retain your work
- Make sure you issue "nvidia-cuda-mps-server" when using GPU nodes, as this loads all the required modules to make them work with Singularity
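Because ${TMPDIR} is removed when the job ends, results have to be copied out before the script exits. The following is a minimal sketch of one way to do that; the names WORKDIR, RESULTS_DIR and save_results are hypothetical, and the default destination under the current directory is only a placeholder for your own project directory.

```shell
# save_results FILE...: copy the named files from the job scratch directory to
# a permanent location. WORKDIR and RESULTS_DIR are hypothetical names; point
# RESULTS_DIR at your own project directory.
WORKDIR="${TMPDIR:-/tmp}"
RESULTS_DIR="${RESULTS_DIR:-${PWD}/results}"

save_results() {
    mkdir -p "$RESULTS_DIR" || return 1
    for f in "$@"; do
        if [ -e "${WORKDIR}/${f}" ]; then
            cp -r "${WORKDIR}/${f}" "${RESULTS_DIR}/"
        else
            echo "save_results: ${f} not found in ${WORKDIR}" >&2
        fi
    done
}

# Example for the clustalo job above, placed right after "cd $TMPDIR" so the
# copy runs whenever the script exits:
#   trap 'save_results result.fasta' EXIT
```

Note that a job killed at its time limit may not get the chance to run the trap, so it can still be worth copying large intermediate results explicitly as they are produced.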
Workshop Materials
The following materials from the workshop on October 14, 2021 are available for download:
- For slides Click here
- For the list of commands Click here