Setting up a cloud computing lab on AWS

I recently needed to run a tutorial at a winter school in computational chemistry. Setting up an impromptu computing lab can be a bit of a hassle if:

i)   one is using a local computer lab one is not familiar with
ii)  or each student is using their own laptop

It is thus convenient to set up a cloud computing solution that one can manage oneself; the local computer or student laptop then only needs to be able to make an SSH connection to the cloud computing instance.

A cloud computing solution allows one to create as many Linux instances in the cloud as required (e.g. 1 for each student). One can create multiple instances simultaneously and then have the students log in to each instance via SSH and a private key. The same private key for all students is fine for a 1-day session, but for a longer lab one would want individual private keys. Each instance can be configured with respect to OS (e.g. Ubuntu), storage, RAM and CPUs.
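The same workflow can also be driven from the command line instead of the web console. The following is only a rough sketch using the AWS CLI: the AMI ID, key-pair name, security group and host name are hypothetical placeholders (not values from the actual lab), and by default the script only prints the launch command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch: launch N identical instances and build the SSH command for a student.
# All IDs and names below are hypothetical placeholders.
AMI_ID="ami-xxxxxxxx"        # placeholder: an Ubuntu AMI in your region
KEY_NAME="lab-key"           # placeholder: name of a key pair created in EC2
SEC_GROUP="sg-xxxxxxxx"      # placeholder: security group allowing inbound SSH (port 22)
INSTANCE_TYPE="t2.micro"
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the launch command

launch_cmd() {
    # $1 = number of instances to create in one call
    echo "aws ec2 run-instances --image-id $AMI_ID --count $1 --instance-type $INSTANCE_TYPE --key-name $KEY_NAME --security-group-ids $SEC_GROUP"
}

ssh_cmd() {
    # $1 = public DNS name or IP address of an instance
    # "ubuntu" is the default user on Ubuntu AMIs
    echo "ssh -i ${KEY_NAME}.pem ubuntu@$1"
}

cmd="$(launch_cmd 20)"
if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
else
    $cmd
fi

# What each student would type (host name is a placeholder)
ssh_cmd "ec2-host.example.com"
```

Setting DRY_RUN=0 would actually launch the instances, which requires a configured AWS CLI and may incur costs.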

However, one must be careful about the associated cost, which depends on how powerful the instances need to be (CPU/RAM) and how long they run. I went for Amazon AWS EC2 services.
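To get a feeling for the numbers before launching anything, a back-of-the-envelope estimate is simply instances x hours x hourly price. The sketch below assumes a made-up price of 0.10 USD per instance-hour; the real on-demand price depends on instance type and region and must be looked up.

```shell
# Sketch: estimated cost = instances x hours x hourly price.
# The price used in the example call is a hypothetical placeholder.
estimate_cost() {
    # $1 = number of instances, $2 = hours each runs, $3 = price per instance-hour (USD)
    awk -v n="$1" -v h="$2" -v p="$3" 'BEGIN { printf "%.2f\n", n * h * p }'
}

# 20 students, an 8-hour day, hypothetical 0.10 USD/hour
estimate_cost 20 8 0.10   # prints 16.00
```

Storage and data transfer are billed separately and are not included in this estimate.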

Useful pages on EC2:

https://aws.amazon.com/pm/ec2 

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html


AWS info for classrooms:

https://docs.aws.amazon.com/whitepapers/latest/setting-up-multi-user-environments/introduction.html

https://docs.aws.amazon.com/whitepapers/latest/setting-up-multi-user-environments/scenario-1.html

https://docs.aws.amazon.com/whitepapers/latest/setting-up-multi-user-environments/scenario-2.html


Set up

Warning: Be careful to stick with the free-tier instances to begin with (either t2.micro or t3.micro; they should be clearly labelled as free-tier eligible) to avoid any costs upon creating each instance. If you select other instance types then you are outside the free tier, and the more CPU/RAM per instance, the higher the hourly cost.
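A quick way to check whether a planned session fits in the free tier is to count instance-hours. The sketch below assumes that the 750 hours per month in the free-tier description are counted as a total across all instances in the account, so 20 instances running for 8 hours use 160 of them.

```shell
# Sketch: do N instances running H hours each fit in the monthly free-tier allowance?
# Assumption: the allowance is a total of instance-hours summed over all instances.
FREE_TIER_HOURS=750

instance_hours() {
    # $1 = number of instances, $2 = hours each instance runs
    echo $(( $1 * $2 ))
}

within_free_tier() {
    [ "$(instance_hours "$1" "$2")" -le "$FREE_TIER_HOURS" ]
}

if within_free_tier 20 8; then
    echo "20 instances x 8 h = $(instance_hours 20 8) instance-hours: within $FREE_TIER_HOURS"
else
    echo "over the free-tier allowance"
fi
```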

"Free tier: In your first year includes 750 hours of t2.micro (or t3.micro in the Regions in which t2.micro is unavailable) instance usage on free tier AMIs per month, 30 GiB of EBS storage, 2 million IOs, 1 GB of snapshots, and 100 GB of bandwidth to the internet."

A t3.micro instance comes with 2 vCPUs (a t2.micro has only 1), which should be sufficient for each student for simple preliminary exercises. Both come with 1 GiB RAM, which is hopefully sufficient but could be tricky. If more RAM is required then one needs to go outside the free tier. Storage (needed for programs and for exercises) also costs, so that needs to be taken into account.
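Once logged in to an instance, it is easy to confirm what one actually got. This is a small sketch using standard Linux tools; the function name is just for illustration.

```shell
# Sketch: report CPU count, total memory and free disk space on the current Linux machine.
report_resources() {
    echo "CPUs: $(nproc)"
    # MemTotal in /proc/meminfo is given in kB; convert to GiB
    awk '/^MemTotal:/ {printf "RAM total: %.1f GiB\n", $2 / 1048576}' /proc/meminfo
    # -P keeps each filesystem on one line, so column 4 is always the available space
    df -Ph "${HOME:-/}" | awk 'NR==2 {print "Disk free in home: " $4}'
}

report_resources
```

Running this again after installing the software shows how much of the disk the programs have used.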


Install on each machine

The purpose was to run ASH on each machine, which requires Python, NumPy, OpenMM, xTB and a few other programs.

This was accomplished by copy-pasting the snippet below into the terminal, which should install and set up each new machine without any user intervention (takes a few minutes).


#Download Miniforge (mamba,conda)

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh

#Install without prompt

sh Miniforge3-Linux-x86_64.sh -b

#Initialize shell

~/miniforge3/bin/mamba init

#Delete

rm Miniforge3-Linux-x86_64.sh

#Reset shell

source ~/.bashrc

#Installation: -y installs without confirm

mamba install -y openmm

mamba install -y xtb-python

mamba install -y pdbfixer

mamba install -y mdtraj

mamba install -y ipython

pip3 install geometric

pip3 install matplotlib

#Delete conda pkgs after installed (4 GB) due to limited space

rm -rf ~/miniforge3/pkgs/*

#ASH

mkdir ~/ASH-code

cd ~/ASH-code

git clone https://github.com/RagnarB83/ash.git

cd ash

git checkout NEW

cd

#PATHS

echo "export PYTHONUNBUFFERED=1" >> ~/.bashrc

echo 'export PYTHONPATH=$HOME/ASH-code:$PYTHONPATH' >> ~/.bashrc

source ~/.bashrc

#CALCULATION DIR SETUP

mkdir CALC_DIR

cd CALC_DIR
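After the snippet has finished, it can be worth checking that the Python packages actually import before handing the machine to a student. The following is a minimal sketch; the list of module names is an assumption based on the packages installed above (e.g. that xtb-python is imported as xtb), and ASH itself is not checked here.

```shell
# Sketch: check that a list of Python modules can be imported.
check_modules() {
    # Prints "OK name" or "MISSING name" for each module name given as an argument
    for mod in "$@"; do
        if python3 -c "import $mod" >/dev/null 2>&1; then
            echo "OK $mod"
        else
            echo "MISSING $mod"
        fi
    done
}

# Module names assumed from the install commands above
check_modules numpy openmm xtb pdbfixer mdtraj geometric matplotlib
```

Any line starting with MISSING points to an install step that failed, often because the disk filled up.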