Singularity/Apptainer (https://apptainer.org/) simplifies the creation and execution of containers, ensuring software components are encapsulated for portability and reproducibility.
(Very important) New Singularity features may not yet be reflected in this document; please refer to SYLABS. The Singularity FAQs are also helpful.
Newer versions of Singularity may have different commands - see the Singularity documentation [1].
To modify a container, you need to create the image on your own PC as admin/root using Singularity, transfer it to your home directory on the HPC, and run it with the Singularity installed on the HPC.
To run on multiple nodes, the host MPI and the container MPI need to be compatible; using the same version is an easy way to rule out any possible issues and is usually easy to accomplish.
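For example, you can compare the two MPI versions on the host and inside the container (the image name below is a placeholder):
mpirun --version   # MPI on the host
singularity exec <your-mpi-image>.sif mpirun --version   # MPI inside the container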
If you are using your own container, you can delete it from your home directory as soon as you are done to save space.
If you are using Singularity, set the Singularity cache environment variable to /scratch or another location, and delete that cache directory when you are done to avoid GBs of usage.
export SINGULARITY_CACHEDIR=/mnt/pan/courses/<course-directory>/<caseID>/singularity # This is directing to course space
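For example, a sketch that keeps the cache in scratch space instead and cleans it up afterwards (the /scratch path below is a placeholder; substitute your own scratch directory):
export SINGULARITY_CACHEDIR=/scratch/<your-scratch-dir>/singularity
singularity cache clean   # remove cached layers when you are done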
The default 1 GB of memory may not be sufficient for pulling some images and you may encounter an error such as "create command failed: signal:killed"; assign sufficient memory using the "--mem" Slurm flag.
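For example, you might request an interactive session with more memory before pulling an image (8gb here is an arbitrary value; adjust as needed):
srun --mem=8gb --pty bash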
The system-defined bind paths for Singularity containers are $HOME, /tmp, /proc, /sys, /dev, etc. The bind feature of Singularity with the -B flag can be used for /mnt and /scratch, as shown in the example below. Also, it is recommended to use /scratch space instead of /tmp.
singularity exec -B /mnt,/scratch --nv $TENSORFLOW python train.py
Request a compute node:
srun --pty bash
Load the Singularity module:
module load singularity
Run singularity to see the options:
singularity
Output:
USAGE: singularity [global options...] <command> [command options...] ...
...
View the container exec command usage:
singularity exec --help
Output:
USAGE: singularity [...] exec [exec options...] <container path> <command>
EXEC OPTIONS:
-B/--bind <spec> A user-bind path specification. spec has the format
...
Pull application container images from Singularity Hub or Docker Hub:
singularity pull shub://<singularity-container-full-path> #e.g. Singularity path: shub://ucr-singularity/cuda-10.1-base:latest
singularity pull docker://<docker-container-full-path> #e.g. docker path: docker://samuelcolvin/tensorflow-gpu-py3
Run the containerized application:
singularity exec -B /mnt --nv ./<app-image>.sif <executable> ... #Example: singularity exec -B /mnt --nv ./cuda-12.6-base_latest.sif nvcc -V
Detailed Information
Singularity allows you to map directories on your host system to directories within your container using bind mounts. This allows you to read and write data on the host system with ease [20]. Many of the Singularity commands such as run, exec, and shell take the --bind/-B command-line option to specify bind paths, in addition to the SINGULARITY_BINDPATH environment variable.
System-defined bind points include, but are not limited to, $HOME, /tmp, /proc, /sys, and /dev. To also map /scratch and /mnt using exec, the command becomes:
singularity exec -B /mnt,/scratch --nv $TENSORFLOW python <python-script>
Otherwise, you will get an error like the following even though the file exists under /mnt/<path-to-train.py>:
python: can't open file './train.py': [Errno 2] No such file or directory
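Alternatively, as noted above, the bind paths can be supplied through the SINGULARITY_BINDPATH environment variable instead of the -B flag:
export SINGULARITY_BINDPATH="/mnt,/scratch"
singularity exec --nv $TENSORFLOW python <python-script>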
After the release of Singularity-2.3, it is no longer necessary to install NVIDIA drivers into your Singularity container to access the GPU on a host node [10]. Now, you can simply use the --nv option to grant your containers GPU support at runtime.
See the --nv option:
singularity exec --help
Output:
USAGE: singularity [...] exec [exec options...] <container path> <command>
...
-n/--nv Enable experimental Nvidia support
...
Check the Singularity containers available on the HPC whose environment variables are defined in the Singularity module:
module display singularity
Output:
...
prepend_path("PARAVIEW","/usr/local/software/singularity/containers/paraview/paraview-index_5.7.0-egl-pvw.sif")
prepend_path("VPT","/usr/local/software/singularity/containers/vpt/vpt_latest.sif")
prepend_path("ELBENCHO","/usr/local/software/singularity/containers/elbencho/elbencho_latest.sif")
prepend_path("TENSORFLOW","/usr/local/software/singularity/containers/niftynet/tensorflow-gpu-py36_latest.sif")
prepend_path("OPENWEBUI","/usr/local/software/singularity/containers/openwebui/openwebui-ollama.sif")
...
So, ParaView, VPT, Elbencho, TensorFlow, and Open WebUI/Ollama are installed as Singularity containers.
You can also run a container at run time without creating or building it first:
singularity exec --nv docker://tensorflow/tensorflow:latest-gpu-py3 python3 ./models/tutorials/image/mnist/convolutional.py
The models directory referenced in the command above can be cloned into your home directory:
git clone https://github.com/tensorflow/models.git
You can pull a container from Singularity Hub (shub) or Docker Hub on the HPC as a regular user:
singularity pull shub://ucr-singularity/cuda-12.0-base:latest
It creates the container "ucr-singularity-cuda-12.0-base-master-latest.simg"
Or, using Docker:
singularity pull docker://samuelcolvin/tensorflow-gpu-py36
You can also build your own image (e.g. cuda.img) from shub or docker on the HPC as a regular user:
singularity build cuda.img shub://ucr-singularity/cuda-12.0-base:latest
If you build Docker containers locally on your development machine, you can move them to the HPC and import/run them with the Singularity runtime. For example, export the container to a tar file and copy it to the HPC:
docker save -o mycontainer.tar ${container or image id}
scp mycontainer.tar <caseID>@hpctransfer1.case.edu:./
Then, from a terminal on the HPC, convert the Docker format to Singularity:
module load singularity
singularity build mycontainer.sif docker-archive://mycontainer.tar
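To check that the converted image works, run a simple command inside it (cat /etc/os-release is just an example; any command available in the container will do):
singularity exec mycontainer.sif cat /etc/os-release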
Case Study: Gromacs-Plumed
Though you may get an error like "Error :[gput027:28538] [[18950,0],0] ORTE_ERROR_LOG: Not found in file plm_slurm_module.c at line 450" when using multiple nodes, you should be able to run MPI Gromacs-Plumed requesting multiple processors on a single node.
Pull the image from Docker:
singularity pull docker://rinnocente/gromed-ts-revised-2019:latest
Run the job using the job script template below:
#!/bin/bash
#SBATCH -N 1 -n <# of CPUs> --mem=<memory-size>gb
module load singularity
cp -r <topology-and-data-files> $PFSDIR
cd $PFSDIR
singularity exec -B /scratch ~/<path-to-image>/gromed-ts-revised-2019_latest.sif mpiexec -np <# of CPUs> /usr/local/gromacs/bin/gmx_mpi mdrun -nb auto -s <TPR file> -nsteps <steps> -plumed <plumed-input-data>
cp -ru * $SLURM_SUBMIT_DIR   # copy results back from scratch to the submission directory
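Save the template to a file and submit it with sbatch (the file name below is arbitrary):
sbatch gromacs-plumed.slurm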
Installing Singularity on your PC - https://sylabs.io/guides/3.6/admin-guide/admin_quickstart.html#installation-from-source
yum update -y && \
yum groupinstall -y 'Development Tools' && \
yum install -y \
openssl-devel \
libuuid-devel \
libseccomp-devel \
wget \
squashfs-tools
wget https://dl.google.com/go/go1.12.5.linux-amd64.tar.gz
tar xzvf go1.12.5.linux-amd64.tar.gz
export GOPATH=/usr/local/src/singularity/go/go
export PATH=/usr/local/src/singularity/go/go/bin:$PATH
export VERSION=3.2.0
wget https://github.com/sylabs/singularity/releases/download/v${VERSION}/singularity-${VERSION}.tar.gz
tar -xzf singularity-${VERSION}.tar.gz
cd ./singularity
./mconfig --prefix=/usr/local/singularity/3.2.0
make -C ./builddir
sudo make -C ./builddir install
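You can confirm the installation by checking the version (the path matches the --prefix used above):
/usr/local/singularity/3.2.0/bin/singularity version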
Very Important: You need to have admin access on your PC. You cannot install or modify your image on the HPC; you can only run it there.
You can either build the sandbox directory directly from a Docker container:
sudo /usr/local/singularity/3.5.1/bin/singularity build --sandbox /mnt/vstor/RCCIDATA/singularity/rapidsai/ docker://rapidsai/rapidsai
Or, you can convert an existing image to a sandbox:
sudo /usr/local/singularity/3.5.1/bin/singularity build --sandbox niftynet niftynet.img
Shell into the sandbox directory (e.g. niftynet) and install the required (dependency) packages for the container:
sudo /usr/local/singularity/3.5.1/bin/singularity shell -w niftynet/
If you are using any other mount points, use the -B option, e.g. -B /mnt.
Install the required packages and exit from the shell (e.g. install the tensorflow-gpu package):
pip install tensorflow-gpu==1.12.2
exit
Create the image as root from the sandbox:
sudo /usr/local/singularity/3.5.1/bin/singularity build niftynet.img niftynet
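Then transfer the finished image to your home directory on the HPC and run it there with the HPC's Singularity module, for example (the Python script is a placeholder):
scp niftynet.img <caseID>@hpctransfer1.case.edu:./
module load singularity
singularity exec --nv ~/niftynet.img python <python-script>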
You can create your own Singularity definition file or find one on GitHub and build a container from it. You need to be root. Check the def files in /usr/local/singularity/definitions/<application>/<application-def-file>.def as a reference.
sudo /usr/local/singularity/3.5.1/bin/singularity build opensees.img /usr/local/singularity/definitions/opensees/opensees.def
Troubleshooting:
If you get a "Setting locale failed" error, you need to update and install language-pack-en-base:
apt-get update
apt-get install language-pack-en-base
Issue with pulling a container? Clean the Singularity cache:
singularity cache clean
References:
[1] (i) Singularity Home (ii) SYLABS
[3] Creating Singularity Image
[4] Bootstrapping Singularity Image
[7] Singularity in Slurm Cluster
[9] Tensorflow - MNIST Tutorial
[10] HPC @ NIH
[11] OpenSees def file
[12] Images in Docker Container
[13] Anaconda/Keras: https://github.com/Drunkar/dockerfiles/tree/master/anaconda-tensorflow-gpu-keras
[14] All-in-One Docker Image: https://github.com/floydhub/dl-docker
[15] Singularity Docker: http://singularity.lbl.gov/docs-docker
[16] All Available Docker Images for Tensorflow: https://hub.docker.com/r/tensorflow/tensorflow/tags/
[17] Neural Machine Translation (NMT)
[18] Singularity Hub (shub): https://singularity-hub.org/
[19] Docker Hub (docker): https://hub.docker.com/