Singularity/Apptainer
Singularity/Apptainer (https://apptainer.org/) simplifies the creation and execution of containers, ensuring software components are encapsulated for portability and reproducibility.
Important Notes
(Very important) Newer Singularity features may not be reflected in this document; please refer to the SYLABS documentation. The Singularity FAQs are also helpful.
The container images in HPC are mostly located at /usr/local/singularity/containers and definition files are located at /usr/local/singularity/definitions.
Newer versions of Singularity may have different commands - see the Singularity documentation [1].
To modify a container, you NEED TO CREATE the image on YOUR own PC as admin/root using Singularity, transfer it to your home directory in HPC, and run it using the Singularity installed in HPC.
For running on multiple nodes, the host MPI and the container MPI need to be compatible; using the same version is an easy way to rule out possible issues and is usually easy to accomplish.
When you are done, delete your container from your home directory to save space if you are using your own container.
If you are making use of Singularity, set the Singularity cache environment variable to /scratch or another large location, and delete that cache directory when done to avoid GBs of usage:
export SINGULARITY_CACHEDIR=/mnt/pan/courses/<course-directory>/<caseID>/singularity
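As a minimal sketch of this cleanup pattern (using a hypothetical /tmp path for illustration; on HPC use a /scratch or /mnt/pan path like the export line above):

```shell
# Hypothetical cache location for illustration only; substitute your own
# /scratch or course-directory path on HPC.
export SINGULARITY_CACHEDIR=/tmp/demo-singularity-cache
mkdir -p "$SINGULARITY_CACHEDIR"

# ... run your singularity pull/exec commands here ...

# When finished, delete the cache to reclaim the space:
rm -rf "$SINGULARITY_CACHEDIR"
```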
The system-defined bind points for Singularity containers include $HOME, /tmp, /proc, /sys, /dev, etc. The bind feature of Singularity (the -B flag) can be used for /mnt and /scratch, as shown in the example below. Also, it is recommended to use /scratch space instead of /tmp.
singularity exec -B /mnt,/scratch --nv $TENSORFLOW python train.py
Running Singularity in HPC
Request a compute node
srun --pty bash
Load the Singularity module:
module load singularity
Run singularity to see the options
singularity
Output:
USAGE: singularity [global options...] <command> [command options...] ...
...
View the Container command usage
singularity exec --help
Output:
USAGE: singularity [...] exec [exec options...] <container path> <command>
EXEC OPTIONS:
-B/--bind <spec> A user-bind path specification. spec has the format
...
Pull Application Container Images from Singularity and Docker
singularity pull shub://<singularity-container-full-path> #Eg. Singularity path: shub://ucr-singularity/cuda-10.1-base:latest
singularity pull docker://<docker-container-full-path> #Eg. docker path: docker://samuelcolvin/tensorflow-gpu-py3
Run the containerized application:
singularity exec -B /mnt --nv ./<app-image>.sif <executable> ... #Example: singularity exec -B /mnt --nv ./cuda-10.1-base_latest.sif nvcc -V
Detailed Information
Bind Paths in Singularity
Singularity allows you to map directories on your host system to directories within your container using bind mounts. This allows you to read and write data on the host system with ease [20]. Many of the Singularity commands such as run, exec, and shell take the --bind/-B command-line option to specify bind paths, in addition to the SINGULARITY_BINDPATH environment variable.
System-defined bind points include, but are not limited to, $HOME, /tmp, /proc, /sys, and /dev. To map /scratch and /mnt using exec:
singularity exec -B /mnt,/scratch --nv $TENSORFLOW python <python-script>
Otherwise, you will get an error like the following, even though the file exists at /mnt/<path-to-train.py>:
python: can't open file './train.py': [Errno 2] No such file or directory
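Alternatively, instead of passing -B on every command, the SINGULARITY_BINDPATH environment variable mentioned above can be set once per session (a sketch; the commented exec line assumes the $TENSORFLOW variable defined by the Singularity module):

```shell
# Equivalent to passing -B /mnt,/scratch on each command:
export SINGULARITY_BINDPATH="/mnt,/scratch"

# The bind flag can now be omitted, e.g.:
# singularity exec --nv $TENSORFLOW python train.py
```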
Container Support at Run Time
After the release of Singularity-2.3, it is no longer necessary to install NVIDIA drivers into your Singularity container to access the GPU on a host node [10]. Now, you can simply use the --nv option to grant your containers GPU support at runtime.
See the --nv option:
singularity exec --help
Output:
USAGE: singularity [...] exec [exec options...] <container path> <command>
...
-n/--nv Enable experimental Nvidia support
...
Check the Singularity containers available in HPC whose environment variables are defined in the Singularity module:
module display singularity
Output:
...
prepend_path("TENSORFLOW","/usr/local/tensorflow/2017/tensorflow.img")
prepend_path("FASTAI","/usr/local/fastai/1.0.50/tensorflow-gpu-py36.simg")
prepend_path("OPENSEES","/usr/local/singularity/containers/opensees/opensees.simg")
prepend_path("OPENSEES25","/usr/local/singularity/containers/opensees/opensees-2.5.simg")
...
So, TensorFlow, FastAI, OpenSees, RAPIDS, etc. are installed as Singularity containers.
Check this guide to try running some of the containerized applications using singularity.
Running your own Container in HPC
Use Container at Run Time
You can run a container directly at run time without building it first:
singularity exec --nv docker://tensorflow/tensorflow:latest-gpu-py3 python3 ./models/tutorials/image/mnist/convolutional.py
The models directory used in the above command can be cloned into your home directory:
git clone https://github.com/tensorflow/models.git
Build your own container in HPC
You can pull a container from shub or docker in HPC as a regular user:
singularity pull shub://ucr-singularity/cuda-9.0-base:latest
It creates the container "ucr-singularity-cuda-9.0-base-master-latest.simg"
Or, using docker
singularity pull docker://samuelcolvin/tensorflow-gpu-py36
You can build your image (e.g. cuda.img) from shub or docker in HPC as a user
singularity build cuda.img shub://ucr-singularity/cuda-9.0-base:latest
MPI Container
Case Study: Gromacs-Plumed
Though you may get an error like "[gput027:28538] [[18950,0],0] ORTE_ERROR_LOG: Not found in file plm_slurm_module.c at line 450" when using multiple nodes, you should be able to run MPI Gromacs-Plumed by requesting multiple processors on a single node.
Pull the image from Docker:
singularity pull docker://rinnocente/gromed-ts-revised-2019:latest
Run the job using the job script template below:
#!/bin/bash
#SBATCH -N 1 -n <# of CPUs> --mem=<memory-size>gb
module load singularity
cp -r <topology-and-data-files> $PFSDIR
cd $PFSDIR
singularity exec -B /scratch ~/<path-to-image>/gromed-ts-revised-2019_latest.sif mpiexec -np <# of CPUs> /usr/local/gromacs/bin/gmx_mpi mdrun -nb auto -s <TPR file> -nsteps <steps> -plumed <plumed-input-data>
cp -ru * $SLURM_SUBMIT_DIR   # copy results back from scratch
Building from Scratch on your PC & Transferring it to HPC
Installing Singularity on your PC - https://sylabs.io/guides/3.6/admin-guide/admin_quickstart.html#installation-from-source
yum update -y && \
yum groupinstall -y 'Development Tools' && \
yum install -y \
openssl-devel \
libuuid-devel \
libseccomp-devel \
wget \
squashfs-tools
wget https://dl.google.com/go/go1.12.5.linux-amd64.tar.gz
tar xzvf go1.12.5.linux-amd64.tar.gz
export GOPATH=/usr/local/src/singularity/go/go
export PATH=/usr/local/src/singularity/go/go/bin:$PATH
export VERSION=3.2.0
wget https://github.com/sylabs/singularity/releases/download/v${VERSION}/singularity-${VERSION}.tar.gz
tar -xzf singularity-${VERSION}.tar.gz
cd ./singularity
./mconfig --prefix=/usr/local/singularity/3.2.0
make -C ./builddir
sudo make -C ./builddir install
Very Important: You need admin access on your PC. You cannot build or modify your image in HPC; you can only run it there.
You can either build the sandbox directory directly from a Docker container:
sudo /usr/local/singularity/3.5.1/bin/singularity build --sandbox /mnt/vstor/RCCIDATA/singularity/rapidsai/ docker://rapidsai/rapidsai
Or, convert an existing image to a sandbox:
sudo /usr/local/singularity/3.5.1/bin/singularity build --sandbox niftynet niftynet.img
Shell into the sandbox directory (e.g. niftynet) with write access (-w) and install the required dependency packages for the container:
sudo /usr/local/singularity/3.5.1/bin/singularity shell -w niftynet/
If you are using any other mount points, use the -B option, e.g. -B /mnt.
Install the required packages and exit from the shell (e.g. install the tensorflow-gpu package):
pip install tensorflow-gpu==1.12.2
exit
Create the image as root from the sandbox:
sudo /usr/local/singularity/2.5.1/bin/singularity build niftynet.img niftynet
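Once the image is built, transfer it to your home directory in HPC, as noted in the Important Notes above (a template with placeholder case ID and login host; substitute your own values):

```
scp niftynet.img <caseID>@<hpc-login-host>:~/
```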
Building from Singularity Definition file
You can create your own Singularity definition file, or find one on GitHub, and build a container from it. You need to be root. Check the def files in /usr/local/singularity/definitions/<application>/<application-def-file>.def for reference.
sudo /usr/local/singularity/2.5.1/bin/singularity build opensees.img /usr/local/singularity/definitions/opensees/opensees.def
Troubleshooting:
If you get a "Setting locale failed" error, update and install language-pack-en-base:
apt-get update
apt-get install language-pack-en-base
Issue pulling a container? Clean the cache:
singularity cache clean
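If cleaning via the CLI still leaves stale blobs, you can locate and remove the cache directory manually (a sketch; the default path assumes no SINGULARITY_CACHEDIR override):

```shell
# Default cache lives under ~/.singularity/cache unless SINGULARITY_CACHEDIR is set.
CACHE_DIR="${SINGULARITY_CACHEDIR:-$HOME/.singularity/cache}"
echo "Singularity cache: $CACHE_DIR"
# rm -rf "$CACHE_DIR"   # uncomment to remove the cache entirely
```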
References:
[1] (i) Singularity Home (ii) SYLABS
[3] Creating Singularity Image
[4] Bootstrapping Singularity Image
[7] Singularity in Slurm Cluster
[9] Tensorflow - MNIST Tutorial
[10] HPC @ NIH
[11] OpenSees def file
[12] Images in Docker Container
[13] Anaconda/Keras - https://github.com/Drunkar/dockerfiles/tree/master/anaconda-tensorflow-gpu-keras
[14] All-in-One Docker Image - https://github.com/floydhub/dl-docker
[15] Singularity Docker - http://singularity.lbl.gov/docs-docker
[16] All Available Docker Images for Tensorflow - https://hub.docker.com/r/tensorflow/tensorflow/tags/
[17] Neural Machine Translation (NMT)
[18] Singularity Hub (shub) - https://singularity-hub.org/
[19] Docker Hub (docker) - https://hub.docker.com/