Installing Python Packages

Synopsis

System-wide packages reside in read-only locations. The Python package installer pip installs packages in ~/.local/lib/ when the --user option is specified. Alternatively, when a virtual environment is active, pip installs to the environment's lib/ folder. There are also options and environment variables that allow for customization and more complex setups.


Default User Location

Package install locations are specific to the Python major and minor version, e.g.:

module load Python/3.7.4

pip install --user pyyaml

The above will install the package to ~/.local/lib/python3.7/site-packages/, after which we can import the package in Python 3.7 with no additional steps required to configure the environment.
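
To confirm where pip install --user will place packages for the loaded interpreter, the standard-library site module can report the user base and user site-packages directories. The following is a minimal sketch; the helper name user_install_paths is illustrative and not part of this guide.

```python
import site
import sys


def user_install_paths():
    """Return (user_base, user_site) for the running interpreter."""
    # On Linux the user base defaults to ~/.local unless PYTHONUSERBASE is set.
    user_base = site.getuserbase()
    # On Linux the user site-packages path embeds the major.minor version.
    user_site = site.getusersitepackages()
    return user_base, user_site


if __name__ == "__main__":
    base, user_site = user_install_paths()
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    print("user base:", base)
    print("user site:", user_site)
```

Running this after loading a different Python module should show a different version directory, which is why packages installed for one minor version are not visible to another.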


Virtual Environment

An alternative to --user is to create a self-contained Python virtual environment:

module load Python/3.7.4

python -m venv my_venv

source my_venv/bin/activate

pip install pyyaml

deactivate

The above installs the package to my_venv/lib/python3.7/site-packages/, and we can then import the package in Python 3.7 whenever the environment is active. For more information on using virtual environments on the HPC, refer to the Virtual Environment guide.
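
When it is unclear whether an environment is active, the interpreter itself can be asked. The sketch below relies on the documented difference between sys.prefix and sys.base_prefix inside a venv; the function name in_virtual_environment is an illustrative choice.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("pip installs under:", sys.prefix)
```

With my_venv activated, the second line is expected to show the my_venv directory; after deactivate it should show the module-provided installation instead.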


Modifying Default User Install Location

We can specify a custom location for pip by setting the environment variable PYTHONUSERBASE to a writable location. E.g.,

export PYTHONUSERBASE=~/Pythonpackages

The above instructs pip install --user to install packages under this location (in a version-specific lib/pythonX.Y/site-packages/ directory beneath it), and you can import packages from it as long as the variable is set.
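
The effect of PYTHONUSERBASE can be checked without installing anything. The following sketch starts a fresh interpreter with the variable set and asks it for its user site-packages directory; the helper user_site_for is hypothetical, and the ~/Pythonpackages path simply mirrors the example above.

```python
import os
import subprocess
import sys


def user_site_for(base_dir):
    """Return the user site-packages path a fresh interpreter reports
    when PYTHONUSERBASE is set to base_dir."""
    env = dict(os.environ, PYTHONUSERBASE=base_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.getusersitepackages())"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(user_site_for(os.path.expanduser("~/Pythonpackages")))
```

On Linux the reported path should sit below the chosen base directory, with the Python major and minor version in the path.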


Specify a Custom Install Location

Rather than using an environment variable, we can specify a custom location in the install command with the pip -t option:

pip install -t ~/Pythonpackages pyyaml

When issuing the above, pip will create the directory if it does not exist, but will not create version-specific library directories under the specified directory.

In order to import from this location, we must add the location to the PYTHONPATH environment variable, e.g. by exporting it prior to launching Python:

export PYTHONPATH=~/Pythonpackages:$PYTHONPATH
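
To illustrate how PYTHONPATH makes a flat target directory importable, the sketch below places a dummy module in a temporary directory (standing in for ~/Pythonpackages) and imports it from a fresh interpreter. The names import_via_pythonpath and demo_mod are made up for this example.

```python
import os
import subprocess
import sys
import tempfile


def import_via_pythonpath(package_dir, module_name):
    """Import module_name in a fresh interpreter with package_dir prepended
    to PYTHONPATH and return the file it was loaded from."""
    existing = os.environ.get("PYTHONPATH")
    # Skip the old value when it is unset, so no empty entry is added.
    parts = [package_dir] + ([existing] if existing else [])
    env = dict(os.environ, PYTHONPATH=os.pathsep.join(parts))
    code = f"import {module_name}; print({module_name}.__file__)"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as pkg_dir:
        # A one-line module plays the role of an installed package.
        with open(os.path.join(pkg_dir, "demo_mod.py"), "w") as fh:
            fh.write("VALUE = 1\n")
        print(import_via_pythonpath(pkg_dir, "demo_mod"))
```

Because the directory is not version-specific, any interpreter started with this PYTHONPATH will try to import from it, so it is safest to keep one such directory per Python version.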


Install in Container

If we need something truly isolated, for example a different version of libc.so than the operating system provides, we can build a highly customized container environment:

salloc -c 8 --mem=24g srun --pty /bin/bash

cd $TMPDIR

module load singularity

singularity build --sandbox ./python3 docker://python:3.10

singularity shell --writable ./python3

Singularity> pip install -t /opt/mylib pyyaml

Singularity> chmod -R o+rX /opt/mylib

Singularity> exit

singularity build my_py3.sif ./python3

cp my_py3.sif ~/

rm -rf ./python3 my_py3.sif

We can then use our container to run interactive or batch jobs with the highly customized Python environment.


Invoke Interactively

salloc -c 8 --mem=24g srun --pty /bin/bash

module load singularity

singularity run --env PYTHONPATH=/opt/mylib my_py3.sif python3

>>> import yaml
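
To check that the import resolves to the copy installed in the container rather than some other location, the module's origin can be inspected from the same prompt. This is a small sketch using importlib.util.find_spec; the helper name module_origin is illustrative.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None
    if the module cannot be found on the current import path."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # With PYTHONPATH=/opt/mylib this is expected to point below /opt/mylib,
    # assuming the pip install step in the container succeeded.
    print("yaml:", module_origin("yaml"))
```

A result of None would suggest that PYTHONPATH was not passed into the container or that the install directory differs from the one assumed here.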


Invoke in Batch Job

Python Script my.py

import yaml

f = {"hello": "world"}

print(yaml.dump(f))

Job File my_job.sh

#!/bin/bash

module load singularity

singularity run --env PYTHONPATH=/opt/mylib my_py3.sif python3 my.py

Job Submit Command

sbatch my_job.sh 


Installing Python Package From Source

Beyond the packages available on PyPI, we may want to install directly from the author's source, for example a new release or the bleeding-edge current state of the repository:

Archive File

wget https://github.com/yaml/pyyaml/archive/refs/tags/6.0.2.tar.gz

tar -xzf 6.0.2.tar.gz; cd pyyaml-6.0.2

pip install --user .

Git Repo

git clone https://github.com/yaml/pyyaml.git

cd pyyaml

rm -rf .git

pip install --user .
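
After a source install it can be useful to confirm which version ended up installed. The sketch below uses importlib.metadata, which is in the standard library from Python 3.8 onward (an assumption here, since earlier examples load Python 3.7.4; on 3.7 the pip show command gives similar information). The helper name installed_version is hypothetical.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None
    when no such distribution is installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "PyYAML" is the distribution name on PyPI; the import name is "yaml".
    print("PyYAML:", installed_version("PyYAML"))
```

For the archive example the reported version would be expected to match the release tag, while a clone of the repository may report a newer or development version.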