Overview
A container bundles an application together with its dependencies so that it can run on different operating systems without changes to the code. This guide explains how to use and create Apptainer containers, the preferred container type for HPC clusters.
- Apptainer Usage on HPC Clusters
- Using docker and apptainer images from existing container libraries
- List of useful container libraries
- Using Apptainer
- Slurm Batch Jobs with Containers
- Building Containers
- Things to Keep in Mind
- Workshop Materials
The Apptainer Container Platform and HPC
While there are many different systems for implementing containers, the Apptainer container platform (formerly known as Singularity) is the one most commonly used on HPC clusters. Apptainer containers make it easy to use compute hardware such as GPUs, and they integrate well with the cluster-provided MPI libraries and job schedulers such as Slurm.
As always, email researchit@iastate.edu if you have questions regarding the use of Apptainer.
Using docker and apptainer images from existing container libraries
Apptainer can use containers from Docker, as well as its own Apptainer format.
List of useful container libraries
Docker-based container libraries:
- Docker Hub: https://hub.docker.com/
- Nvidia GPU-Accelerated Containers (NGC): https://ngc.nvidia.com/
- Quay (Bioinformatics): https://quay.io/ or https://biocontainers.pro/#/registry
Using Apptainer
Visit any one of the libraries listed above and search for the container image you need.
After gaining interactive access to one of the compute nodes and loading the apptainer module, first pull (download) the pre-built image:
$ module load apptainer # loads the latest apptainer module
$ apptainer --version # checks the apptainer version
apptainer version 1.3.6
$ apptainer pull gcc.img docker://gcc # pulls an image from docker hub and saves it as gcc.img
This pulls the latest GCC container (gcc 9.1.0 as of 08/19/2019) and saves the image in your current working directory.
If you need a specific GCC version, look at the available tags for the container and append the tag to the end of the container name. For example, to pull GCC 5.3.0:
$ apptainer pull docker://gcc:5.3.0
Note: You can control where Apptainer caches and builds images (assuming you have write permission to that location) by setting the variables APPTAINER_CACHEDIR and APPTAINER_TMPDIR. For instance,
$ export APPTAINER_CACHEDIR=$TMPDIR
$ export APPTAINER_TMPDIR=$TMPDIR
Note on home directory and Apptainer
While pulling containers, keep an eye on your home directory: by default the cached image blobs are saved under ${HOME}/.apptainer.
Since the home directory has a limited amount of space, it can fill up quite easily. You can change where the files are cached by setting the APPTAINER_CACHEDIR and APPTAINER_TMPDIR environment variables.
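If the cache has already consumed space in your home directory, you can inspect and clear it. A minimal sketch, assuming a recent Apptainer release:
$ apptainer cache list   # show how much space the cached image blobs are using
$ apptainer cache clean  # remove cached blobs (they will be re-downloaded on the next pull)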
To use the executables within the container:
$ apptainer exec gcc.img gcc --version
gcc (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
You can use this to compile your code or run your scripts using the container’s executables.
$ apptainer exec gcc.img gcc hello_world.c -o hello_world
$ apptainer exec gcc.img ./hello_world
Hello, World!
/home/${USER}, /work, /ptmp and ${TMPDIR} are accessible from within the container:
$ pwd
/home/ynanyam
$ ls
catkin_ws cuda env_before luarocks pycuda_test.py
$ apptainer exec gcc.img pwd
/home/ynanyam
$ apptainer exec gcc.img ls
catkin_ws cuda env_before luarocks pycuda_test.py
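Directories outside these default bind mounts can be made available inside the container with the --bind option. For example (the /myfiles path below is just a placeholder for a host directory you have access to):
$ apptainer exec --bind /myfiles:/data gcc.img ls /data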
Interactive Access to a Container
When testing a container, it is useful to have Linux shell access to it. The following uses the apptainer command to open a shell in the gcc.img container image. Note: This should be done from a compute node, not the head node.
$ apptainer shell gcc.img
gcc.img:~> gcc --version
gcc (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
To access GPUs within a container, use the "--nv" option:
apptainer exec --nv <options>
apptainer shell --nv <options>
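For example, from a GPU node you can check which GPUs are visible inside a container; this is a sketch that reuses the gcc.img image from above, since --nv binds the host's NVIDIA tools and libraries into the container:
$ apptainer exec --nv gcc.img nvidia-smi   # lists the GPUs visible inside the container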
Slurm batch jobs with containers
Here is an example Slurm batch script that downloads a container and uses it to run a program.
#!/bin/bash
#SBATCH -N1
#SBATCH -n20
#SBATCH -t120
unset XDG_RUNTIME_DIR
cd $TMPDIR
# download and uncompress the example protein sequences
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/Bactrocera_oleae/protein/protein.fa.gz
gunzip protein.fa.gz
# build the clustalo image from the biocontainers registry, then run the alignment
module load apptainer
apptainer build clustalo.sif docker://quay.io/biocontainers/clustalo:1.2.4--1
apptainer exec clustalo.sif clustalo -i protein.fa -o result.fasta --threads=${SLURM_NPROCS} -v
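Assuming the script above is saved as clustalo.sbatch (a filename chosen here for illustration), it can be submitted and monitored in the usual way:
$ sbatch clustalo.sbatch
$ squeue -u $USER
Note that result.fasta is written to ${TMPDIR}, so add a copy step at the end of the script if you want to keep it (see Things to Keep in Mind below).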
Apptainer and Nvidia Container Library (Optional)
Nvidia provides a container library that works with both Docker and Apptainer.
UPDATE (05/11/2020):
The NGC container library is now public; no account or API key is required to access it.
Previously, users had to create an NGC account and an API key (saved in a secure location), export the required environment variables, and then download containers from the catalog at https://ngc.nvidia.com/catalog/containers. Instructions to get started are here: https://docs.nvidia.com/ngc/ngc-user-guide/singularity.html#singularity
NOTE:
Do not set the environment variables APPTAINER_DOCKER_USERNAME and APPTAINER_DOCKER_PASSWORD in your .bashrc file, as that prevents pulling containers from libraries other than NGC: https://groups.google.com/a/lbl.gov/forum/#!topic/singularity/9q1aTycZ6CA
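If you do need authenticated pulls (for example, from a private registry), export the credentials in your interactive session or job script instead. A sketch, assuming NGC's convention of the literal username $oauthtoken together with your API key:
$ export APPTAINER_DOCKER_USERNAME='$oauthtoken'
$ export APPTAINER_DOCKER_PASSWORD=<your NGC API key>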
Below is an example script that downloads the NAMD container and then runs a benchmark on a GPU node. https://ngc.nvidia.com/catalog/containers/hpc:namd
#!/bin/bash
#SBATCH -N1
#SBATCH -n36
#SBATCH -t10
#SBATCH -pgpu
unset XDG_RUNTIME_DIR
cd $TMPDIR
# download the STMV benchmark inputs and the CUDA benchmark configurations
wget http://www.ks.uiuc.edu/Research/namd/utilities/stmv.tar.gz
tar xf stmv.tar.gz
curl -s https://www.ks.uiuc.edu/Research/namd/2.13/benchmarks/stmv_nve_cuda.namd > stmv/stmv_nve_cuda.namd # constant energy
curl -s https://www.ks.uiuc.edu/Research/namd/2.13/benchmarks/stmv_npt_cuda.namd > stmv/stmv_npt_cuda.namd # constant pressure
# pull the NAMD single-node container from NGC
module load apptainer
apptainer pull docker://nvcr.io/hpc/namd:2.13-singlenode
apptainer exec --nv --bind $PWD:/host_pwd namd_2.13-singlenode.sif namd2 +ppn ${SLURM_NPROCS} +setcpuaffinity +idlepoll +devices 0,1 /host_pwd/stmv/stmv_nve_cuda.namd # stmv constant energy benchmark
apptainer exec --nv --bind $PWD:/host_pwd namd_2.13-singlenode.sif namd2 +ppn ${SLURM_NPROCS} +setcpuaffinity +idlepoll +devices 0,1 /host_pwd/stmv/stmv_npt_cuda.namd # stmv constant pressure benchmark
Building Containers
Since Docker images are compatible with Apptainer, users can write the recipe for a container either as a Dockerfile or as an Apptainer definition file.
Most containers can be built without admin privileges: users can build their own containers from a definition file on the cluster compute nodes.
Docker
Below are links to get started on building Docker containers:
Dockerfile reference - https://docs.docker.com/engine/reference/builder/
Upload your dockerfiles to docker hub - https://docs.docker.com/docker-hub/
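As a rough sketch, once you have written a Dockerfile you would build and push the image from a machine where Docker is available (the repository and tag names below are placeholders):
$ docker build -t <dockerhub-username>/mytool:latest .   # build the image from the Dockerfile in the current directory
$ docker push <dockerhub-username>/mytool:latest         # upload it to Docker Hub
The pushed image can then be pulled on the cluster with apptainer pull docker://<dockerhub-username>/mytool:latest.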
Apptainer
The easiest way to create Apptainer containers is via a definition file.
Detailed documentation for definition files is available in the Apptainer user guide, but below is a simple example to get you started.
Bootstrap: docker
From: condaforge/mambaforge

%labels
    maintainer "Name" <email address>

%post
    apt-get update && apt-get install -y git
    # Conda install stringtie
    conda install -c bioconda stringtie
The header includes the Bootstrap agent and the base image (From) the container is built from.
Most definition files use docker as the bootstrap agent because the Docker library is robust and well maintained.
This example uses the existing Docker image for mambaforge (Python 3) available from https://hub.docker.com/r/condaforge/mambaforge
The %post section contains any modifications or additions the user makes to the base container - in this case we add the stringtie package to the container.
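To build an image from this definition file on a compute node (assuming it is saved as stringtie.def, a filename chosen here for illustration):
$ module load apptainer
$ apptainer build stringtie.sif stringtie.def
The resulting stringtie.sif can then be used with apptainer exec or apptainer shell as shown above.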
Things to keep in mind
- By default, /work, /home, ${TMPDIR} and /ptmp are bind mounted into the container
- The user outside the container is the same as the user inside the container (so permissions within the container are the same as on the bare-metal compute node)
- The full networking stack is available from within the container - if the container has the InfiniBand stack installed, it will make use of the high-speed network
Having
unset XDG_RUNTIME_DIR
in your Slurm script is useful when running Jupyter Notebook from the container. It also removes some annoying warnings from your logs.
In the examples above, everything was done within ${TMPDIR}, which is deleted at the end of the job. Make sure you copy the output to your project directory to retain your work.
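For example, the last line of a batch script might copy the results back to permanent storage (the destination path below is a placeholder; use your own project directory under /work):
cp result.fasta /work/<your_group>/<your_netid>/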
Workshop Materials
The following materials from the workshop on October 14, 2021 are available for download:
- For slides Click here
- For the list of commands Click here