Python Virtual Environments
Description
Python Virtual Environments help you to create isolated environments for running Python.
How can Virtual Environments help me?
On the HPC cluster we provide a Python module with a few optimized libraries such as NumPy, SciPy, and Pandas. However, the versions of these libraries may be too new or too old for the package that you want to install. A Python virtual environment can help you in this situation.
Another common situation arises when installing Python packages. Sometimes the package you download calls pip to install its required dependencies. Because of how permissions are set up on the HPC cluster, this usually fails with a Permission denied error. A Python virtual environment solves that problem: the environment lives in a directory that you own, so pip can install into it without root privileges.
When Should I Create a Python Virtual Environment?
Python Virtual Environments are a great way to overcome complicated installs or mismatched package versions. However, there is a price to pay: user space. Depending on how big your install is, it could take up a significant part of your user/group quota. In addition, if you use pip to install the generic packages used for numerical computations, you may lose performance compared with the optimized builds provided in the Python module.
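To get a rough idea of how much space an environment takes, the sketch below sums the sizes of the regular files under a directory. The function name directory_size_bytes and the path ~/p3venv are illustrative assumptions, and the result is the apparent file size, which may differ from what the quota system reports.

```python
import os


def directory_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks: they point at files that are stored elsewhere.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


# Hypothetical location; adjust to wherever your environment lives.
venv_path = os.path.expanduser("~/p3venv")
if os.path.isdir(venv_path):
    size_mb = directory_size_bytes(venv_path) / 1e6
    print(f"{venv_path}: {size_mb:.1f} MB")
```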
So when should I create a Python Virtual Environment?
Create a Python Virtual Environment if:
The provided package versions do not satisfy your requirements: create a blank environment and install all the libraries you need.
You need to install different packages that require different versions of the same package: since two versions of the same package cannot be installed side by side, create two separate environments.
The versions of the provided packages match and you just want to install extra packages with pip as they are needed: in this situation you can keep using our compiled and optimized modules, as long as those modules are not updated.
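Before deciding, it can help to check which versions the loaded Python module actually provides. The following is a minimal sketch using the standard importlib.metadata module (available from Python 3.8); the helper name installed_version and the list of package names are only examples.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Example package names; replace them with the ones your software requires.
for name in ("numpy", "scipy", "pandas"):
    print(name, installed_version(name))
```

Run it once with only the Python module loaded to see what is provided, then compare the result against the requirements of the package you want to install.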
Do NOT create a Python Virtual Environment if:
You can install the Python package in user space (i.e. pip install --user <package>). A user space install saves a lot of space compared with creating a virtual environment.
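To see where a user space install would go, and whether the running interpreter will look there at all, the standard site module can be queried. This is a small sketch; note that inside a virtual environment created without system site packages the user site directory is normally disabled.

```python
import site

# Directory that `pip install --user` targets for this interpreter.
print("user site-packages:", site.getusersitepackages())

# True when the interpreter adds that directory to its import path.
print("user site enabled:", site.ENABLE_USER_SITE)
```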
Create a Virtual Environment
Important Notes
If you have set the PYTHONPATH variable, you will need to unset it to avoid any conflicts with packages. You can do that on a bash shell with:
unset PYTHONPATH
We recommend using Python from a module rather than the system Python, as there can be slight differences in the system Python between machines.
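A quick way to confirm both points from within Python is to report which interpreter is running and whether PYTHONPATH is still set. The helper name interpreter_report is an assumption made for this sketch.

```python
import os
import sys


def interpreter_report(environ=os.environ):
    """Describe the running interpreter and the PYTHONPATH setting."""
    return {
        "executable": sys.executable,             # path of the interpreter in use
        "version": sys.version.split()[0],        # e.g. the version of the loaded module
        "pythonpath": environ.get("PYTHONPATH"),  # None when the variable is unset
    }


for key, value in interpreter_report().items():
    print(f"{key}: {value}")
```

If the executable path does not point into the loaded module, or pythonpath is not None, revisit the notes above before creating the environment.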
Python changed the name of the module used for virtual environment creation between versions 2 and 3, so the directions are slightly different. The commands on this page use the Python 3 module, venv.
To avoid a large footprint in your home directory, create the Python virtual environment in another location if you have access to one. For example, for a class:
python3 -m venv /mnt/pan/courses/<PI_course>/<caseID>/p3venv
Load Python Module
Load the module for your target version of Python, e.g.:
module load Python/3.8.6-GCCcore-10.2.0
Python3: Create Environment
To create a virtual environment named "p3venv" in the current working directory (e.g. your home directory):
python3 -m venv p3venv
This command generates no output, but creates the directory p3venv inside the current working directory. Inside p3venv there are symlinks to the Python binaries and a separate copy of pip, so that you can install packages into the virtual environment.
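The same step can be scripted with the standard venv module, which is what python3 -m venv runs. The sketch below creates a throwaway environment and lists its top-level entries; the function name create_environment and the directory name demo-venv are illustrative, and with_pip=False is chosen only to keep the example fast (the command line installs pip by default).

```python
import os
import tempfile
import venv
from pathlib import Path


def create_environment(target, with_pip=False):
    """Create a virtual environment at target and return its top-level entries."""
    # Use symlinks on non-Windows systems, as the command line does by default.
    venv.create(target, with_pip=with_pip, symlinks=(os.name != "nt"))
    return sorted(entry.name for entry in Path(target).iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(create_environment(Path(scratch) / "demo-venv"))
```

The listing should include a pyvenv.cfg file, which records the Python installation the environment was created from.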
Using a Virtual Environment
To use a Python virtual environment named p3venv that resides in your home directory, run the following command:
source ~/p3venv/bin/activate
The command is the same for virtual environments created by Python versions 2 or 3. Once the environment is active, you can install packages to the environment using pip:
pip install pyyaml
Once you are done using the virtual environment, deactivate it:
deactivate
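To confirm from within Python whether an environment created with venv is active, you can compare sys.prefix with sys.base_prefix: the two differ only inside a virtual environment. A minimal sketch (the function name is illustrative):

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv environment."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
print("prefix:", sys.prefix)
```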
Creating a Consistent Virtual Environment for a Job Submission
Make sure you load the same modules to create the environment as you will load in your Slurm file. The order would be:
1) Load modules
2) Create virtual environment
3) Activate virtual environment
4) Install packages, etc.
5) Test interactively
6) Deactivate
In your Slurm script you need to repeat steps (1) and (3), loading the modules and activating the virtual environment, so that the environment is identical when the Python code runs in the batch job.
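One way to check that the batch job really sees the same environment is to record a snapshot during the interactive test and again inside the job, then compare the two. The sketch below is only an illustration; the names environment_snapshot and differences are assumptions, not part of any cluster tooling.

```python
import json
import sys
from importlib import metadata


def environment_snapshot():
    """Collect the Python version, prefix, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with missing metadata
            packages[name.lower()] = dist.version
    return {
        "python": sys.version.split()[0],
        "prefix": sys.prefix,
        "packages": dict(sorted(packages.items())),
    }


def differences(first, second):
    """Return the top-level keys whose values differ between two snapshots."""
    keys = first.keys() | second.keys()
    return sorted(key for key in keys if first.get(key) != second.get(key))


if __name__ == "__main__":
    # Print as JSON so the output can be saved to a file and compared later.
    print(json.dumps(environment_snapshot(), indent=2))
```

Saving the output from the interactive session and from the job, and comparing the two files, shows quickly whether a module or the activation step is missing from the Slurm script.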
Virtual Environment Kernels for Jupyter
The IPython documentation describes how to prepare a virtual environment for use within applications supporting Jupyter notebooks: https://ipython.readthedocs.io/en/latest/install/kernel_install.html#kernels-for-different-environments
In brief:
source venv/bin/activate
pip install ipykernel
python -m ipykernel install --user --name venv --display-name="My Home VirtEnv"
Then open a Jupyter notebook OnDemand session; the option "My Home VirtEnv" should appear under "New".
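If the kernel does not show up, it can help to inspect the kernel.json file that the install command wrote. The sketch below assumes the usual Linux location for --user kernels, ~/.local/share/jupyter/kernels, and the kernel name venv from the commands above; the helper name kernel_summary is hypothetical.

```python
import json
from pathlib import Path

# Assumed default location for kernels installed with --user on Linux.
USER_KERNELS = Path.home() / ".local" / "share" / "jupyter" / "kernels"


def kernel_summary(name, kernels_dir=USER_KERNELS):
    """Return the display name and interpreter of a kernel, or None if not found."""
    spec_file = Path(kernels_dir) / name / "kernel.json"
    if not spec_file.is_file():
        return None
    spec = json.loads(spec_file.read_text())
    argv = spec.get("argv") or [None]
    return {
        "display_name": spec.get("display_name"),
        "interpreter": argv[0],  # first argv entry is the Python executable
    }


print(kernel_summary("venv"))
```

The interpreter path should point inside the virtual environment; if it points elsewhere, the kernel was probably installed without the environment being active.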
Removing a Virtual Environment
The virtual environment is self-contained, so it is safe to delete its directory if you no longer need it. For example, a virtual environment named p3venv in your home directory can be removed recursively:
rm -rf ~/p3venv
Including System Packages
To create a virtual environment that includes system packages in its search path, we can specify the flag --system-site-packages when the virtual environment is created:
python3 -m venv --system-site-packages p3venv
The environment will now be able to import packages from the site-packages directory of the Python installation used to create it, which can save considerable space if some of the packages you require are already installed there.
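Whether an existing environment was created with this flag is recorded in the pyvenv.cfg file at the top of the environment directory, under the key include-system-site-packages. The following sketch reads that file; the function names are illustrative.

```python
from pathlib import Path


def read_pyvenv_cfg(venv_dir):
    """Parse the key = value lines of pyvenv.cfg into a dictionary."""
    settings = {}
    cfg_text = (Path(venv_dir) / "pyvenv.cfg").read_text()
    for line in cfg_text.splitlines():
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def includes_system_packages(venv_dir):
    """Return True if the environment can import the base installation's packages."""
    value = read_pyvenv_cfg(venv_dir).get("include-system-site-packages", "")
    return value.lower() == "true"
```

For example, includes_system_packages("p3venv") would be expected to return True for an environment created with the command above.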