Python Virtual Environments

Description

Python Virtual Environments help you to create isolated environments for running Python.

How can Virtual Environments help me?

On the HPC cluster we provide a Python module that includes a few optimized libraries such as NumPy, SciPy, and Pandas. However, the versions of these libraries might be too new or too old for the package that you want to install. A Python virtual environment can help you in this situation.

Another common situation appears when installing Python modules. Sometimes a module that you download calls pip to install its required dependencies. Because of how permissions are set up on the HPC cluster, this usually fails with a Permission denied error. A Python virtual environment solves that problem: inside it, pip installs into a directory that you own, so no root privileges are needed.

When Should I Create a Python Virtual Environment?

Python Virtual Environments are a great way to overcome complicated installs or mismatched package versions. However, there is a price to pay: user space. Depending on how big your install is, it could take up a significant share of your user/group quota. In addition, if you use pip to install the generic packages used for numerical computations, you may lose performance compared with the optimized builds provided by the Python module.
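To get a rough idea of how much quota an environment consumes, its directory can be measured from Python. The following is a minimal sketch, not a site-provided tool: the function name directory_size_bytes is made up for illustration, the path ~/p3venv is only the example name used on this page, and the result is the sum of apparent file sizes, which may differ from what the quota system actually counts.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of regular files under root, skipping symlinks."""
    total = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # Symlinks (such as the links to the Python binaries) point
            # outside the environment, so they are not counted.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Assumption: the environment lives in ~/p3venv, the example name on this page.
    env_dir = os.path.expanduser("~/p3venv")
    if os.path.isdir(env_dir):
        size_mb = directory_size_bytes(env_dir) / 1e6
        print(f"{env_dir}: {size_mb:.1f} MB")
```

Comparing this number before and after a large pip install shows how much space that install added.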

So when should I create a Python Virtual Environment?

Create a Virtual Environment

Important Notes

Load Python Module

Load the module for your target version of Python, e.g.:

module load Python/3.8.6-GCCcore-10.2.0

Python3: Create Environment

To create a virtual environment named "p3venv" in the current working directory (e.g. your home directory):

python3 -m venv p3venv

This command generates no output, but creates the directory p3venv inside the current working directory. Inside p3venv there are symlinks to the Python binaries and a separate copy of pip, so that you can install packages into the virtual environment.
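To see what the command leaves behind, the same step can be reproduced with the standard-library venv module and the resulting directory listed. This is only an illustrative sketch: create_environment is a hypothetical helper name, the environment is built in a temporary directory so nothing in your home directory is touched, and pip is deliberately left out (with_pip=False) to keep the demonstration quick, whereas the command above does install it.

```python
import os
import tempfile
import venv


def create_environment(env_dir):
    """Create a virtual environment and return its top-level entries."""
    # The command-line tool uses symlinks everywhere except Windows,
    # so the same choice is made here. with_pip=False skips installing pip.
    venv.create(env_dir, symlinks=(os.name != "nt"), with_pip=False)
    return sorted(os.listdir(env_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        # Throwaway location used only for this demonstration.
        for entry in create_environment(os.path.join(scratch, "p3venv")):
            print(entry)
```

The listing should include pyvenv.cfg, the small text file that records which Python installation the environment was created from.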

Using a Virtual Environment

To use a Python virtual environment named p3venv that resides in your home directory, run the following command (adjust the path if you created the environment elsewhere):

source ~/p3venv/bin/activate

The command is the same for virtual environments created by Python versions 2 or 3. Once the environment is active, you can install packages to the environment using pip:

pip install pyyaml

Once you are done using the virtual environment, deactivate it:

deactivate
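If you are unsure whether an environment is currently active, Python 3 itself can tell you: inside a virtual environment, sys.prefix points at the environment while sys.base_prefix still points at the underlying installation. The sketch below relies only on that documented behaviour; the function name in_virtual_environment is made up for illustration, and the check does not apply to Python 2, which has no sys.base_prefix.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True when the interpreter is running inside a venv."""
    # The arguments exist only to make the function easy to test;
    # by default the values of the running interpreter are used.
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    state = "active" if in_virtual_environment() else "not active"
    print(f"Virtual environment is {state} (sys.prefix = {sys.prefix})")
```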

Creating a Consistent Virtual Environment for a Job Submission

Make sure you load the same modules when creating the environment as you will load in your Slurm script. The order would be:

1) Load modules

2) Create virtual environment

3) Activate virtual environment

4) Install packages etc.

5) Test interactively

6) Deactivate

In your Slurm script you would need to add steps (1) and (3), loading the modules and activating the virtual environment, so that the environment is identical when the batch job runs your Python code.
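A small guard at the top of the Python script run by the batch job can catch a forgotten module load or activation early. The sketch below is an assumption-laden illustration rather than a site-provided check: it assumes the cluster's module system exports the loaded modules in the LOADEDMODULES environment variable (as Lmod and Environment Modules normally do), it relies on the VIRTUAL_ENV variable set by the activate script, and missing_setup_steps is a hypothetical helper name.

```python
import os


def missing_setup_steps(required_module, environ=None):
    """Return a list of problems with the job environment (empty if fine)."""
    environ = os.environ if environ is None else environ
    problems = []
    # Assumption: loaded modules are listed colon-separated in LOADEDMODULES.
    loaded = environ.get("LOADEDMODULES", "").split(":")
    if required_module not in loaded:
        problems.append(f"module {required_module} is not loaded")
    # The activate script sets VIRTUAL_ENV to the environment's directory.
    if not environ.get("VIRTUAL_ENV"):
        problems.append("no virtual environment is activated")
    return problems


if __name__ == "__main__":
    # Module name taken from the example earlier on this page.
    for problem in missing_setup_steps("Python/3.8.6-GCCcore-10.2.0"):
        print(f"Warning: {problem}")
```

The module name passed in should be the one you loaded when you created the environment.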

Virtual Environment Kernels for Jupyter

A method for preparing a virtual environment for use within applications supporting Jupyter notebooks is described at https://ipython.readthedocs.io/en/latest/install/kernel_install.html#kernels-for-different-environments

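In brief summary, the method is to install the ipykernel package inside the activated virtual environment and then register the environment as a Jupyter kernel under a display name, for example "My Home VirtEnv".

Then open a Jupyter notebook OnDemand session, and under "New" the option to select "My Home VirtEnv" should appear.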
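The registration step described in the linked IPython documentation can be sketched from Python as follows. This is a hedged illustration, not the exact commands originally given here: it assumes ipykernel has already been installed into the active environment with pip, the kernel name "my-home-virtenv" is made up, and kernel_install_command and register_kernel are hypothetical helper names. Only the display name "My Home VirtEnv" comes from this page.

```python
import shlex
import subprocess
import sys


def kernel_install_command(name, display_name):
    """Build the ipykernel command that registers this interpreter as a kernel."""
    # sys.executable is the Python of the currently active environment.
    return [
        sys.executable, "-m", "ipykernel", "install",
        "--user",
        "--name", name,
        "--display-name", display_name,
    ]


def register_kernel(name, display_name):
    """Run the registration; requires ipykernel in the active environment."""
    subprocess.run(kernel_install_command(name, display_name), check=True)


if __name__ == "__main__":
    # Only prints the command; call register_kernel(...) to actually run it.
    command = kernel_install_command("my-home-virtenv", "My Home VirtEnv")
    print(shlex.join(command))
```

Run this with the virtual environment activated, so that sys.executable refers to the environment's Python rather than the system one.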

Removing a Virtual Environment

The virtual environment is self-contained, so it is safe to just delete the directory if you no longer need it. For example, a virtual environment named p3venv in your home directory can be removed recursively:

rm -rf ~/p3venv

Including System Packages

To create a virtual environment that includes system packages in its search path, specify the flag --system-site-packages when the virtual environment is created:

python3 -m venv --system-site-packages p3venv

The environment created will now be able to import packages from the system library, which can save considerable space if some packages you require are already installed on the system.
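When system packages are visible, it can be useful to check which copy of a package an import actually resolves to. The following sketch uses the standard importlib machinery for that; module_origin and lives_in_active_env are hypothetical helper names, and the check is meant for top-level package names only.

```python
import importlib.util
import os
import sys


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def lives_in_active_env(name):
    """True if the module's file sits inside the active virtual environment."""
    origin = module_origin(name)
    if origin is None or not os.path.isabs(origin):
        return False  # not installed, or a built-in/frozen module
    in_venv = sys.prefix != sys.base_prefix
    env_root = os.path.realpath(sys.prefix) + os.sep
    return in_venv and os.path.realpath(origin).startswith(env_root)


if __name__ == "__main__":
    # NumPy is used as the example because the system module provides it.
    name = "numpy"
    print(f"{name}: origin={module_origin(name)}, "
          f"in environment={lives_in_active_env(name)}")
```

If the reported origin lies outside the environment directory, the import is being served by the system installation rather than by a copy installed with pip.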