
AlphaFold is intended to provide accurate protein structure predictions [1].

On CWRU HPC resources, AlphaFold2 is installed only on the Pioneer cluster. It is installed locally, so containers are not required to run AlphaFold. An example Slurm script is shown below for requesting resources for a job that uses T1050 as input.

Important Notes:

Quickstart on Running AlphaFold2

Access the Pioneer cluster (see the Quickstart Guide).

Submit the job using one of the options below; the trailing comment describes each option:

sbatch /usr/local/software/AlphaFold/2.2.2/run_alphafold_monomer.slurm <your_seq_file>  # for monomer

sbatch /usr/local/software/AlphaFold/2.2.2/run_alphafold_multimer.slurm <your_seq_file>  # for multimer, full database search, using default GPUs

sbatch /usr/local/software/AlphaFold/2.2.2/multimer_reduced.slurm <your_seq_file>  # for multimer, reduced database search, using GPUs

sbatch -C gpu4v100 --time=320:00:00 --gres=gpu:4 /usr/local/software/AlphaFold/2.2.2/multimer_reduced.slurm <your_seq_file>  # for multimer, reduced database search, using customized resources (Volta GPU nodes with 4 GPUs and 320 hours of wall time)
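The options above differ only in which site-provided Slurm script is submitted. The following is a minimal sketch of a helper that maps a run mode to the matching script and prints the sbatch command for review instead of submitting it. The function name af_submit_cmd and the mode labels are hypothetical; the script paths are the ones listed above.

```shell
#!/bin/bash
# Sketch only: af_submit_cmd and its mode labels are hypothetical names,
# not part of the AlphaFold installation.

AF_SLURM_DIR=/usr/local/software/AlphaFold/2.2.2

af_submit_cmd() {
    local mode=$1 seq_file=$2 script
    case "$mode" in
        monomer)          script=run_alphafold_monomer.slurm ;;
        multimer)         script=run_alphafold_multimer.slurm ;;
        multimer_reduced) script=multimer_reduced.slurm ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # Print the command rather than running it (dry run).
    echo "sbatch $AF_SLURM_DIR/$script $seq_file"
}

# Show the command for a monomer run on a placeholder file name.
af_submit_cmd monomer my_protein.fasta
```

Printing the command first makes it easy to check the script path and input file before anything is queued.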

Code and Database organization

The code is installed in the optimized software tree: /usr/local/easybuild_avx2/software/AlphaFold/2.1.1-fosscuda-2020b.

Running the code requires setting MODULEPATH to include only the 'easybuild_avx2' optimization tree:

export MODULEPATH=/usr/local/easybuild_avx2/modules/all
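The module system reads its search path from the MODULEPATH environment variable, so a quick check before loading AlphaFold can catch a path that still includes other module trees. This is a minimal sketch; modulepath_is_avx2_only is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: modulepath_is_avx2_only is a hypothetical helper.
AVX2_MODULES=/usr/local/easybuild_avx2/modules/all

modulepath_is_avx2_only() {
    # True only when MODULEPATH is exactly the easybuild_avx2 module tree.
    [ "${MODULEPATH:-}" = "$AVX2_MODULES" ]
}

export MODULEPATH=$AVX2_MODULES
if modulepath_is_avx2_only; then
    echo "MODULEPATH OK: $MODULEPATH"
else
    echo "MODULEPATH contains other trees: ${MODULEPATH:-<unset>}" >&2
fi
```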

The database files are stored at /mnt/pan/AlphaFold.

Refer to the documentation appropriate to your own data and objectives to determine which runtime flags to set when launching AlphaFold. A simple example is offered here to illustrate the format of a job script that allocates resources and runs the AlphaFold code. The T1050 single-sequence data was obtained as a FASTA file and copied to $WORKDIR/fastas.
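Before submitting, it can help to confirm that the FASTA file holds the expected number of sequences (a single record for a monomer run). This is a minimal sketch, assuming standard FASTA headers that begin with '>'. The helper count_fasta_records is hypothetical, and the demo sequence is a made-up placeholder, not the T1050 sequence.

```shell
#!/bin/bash
# Sketch only: count_fasta_records is a hypothetical helper.
count_fasta_records() {
    # Count header lines; print 0 when there are none.
    awk '/^>/ {n++} END {print n+0}' "$1"
}

# Demo on a temporary file with a placeholder sequence (not real T1050 data).
demo=$(mktemp)
printf '>example_target\nMKTAYIAKQR\nQISFVKSHFS\n' > "$demo"
n=$(count_fasta_records "$demo")
echo "records: $n"
rm -f "$demo"
```

For real input, point the function at the file under $WORKDIR/fastas instead of the temporary demo file.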

Batch Job Script


#!/bin/bash
#SBATCH -p gpu
#SBATCH --gres=gpu:2
#SBATCH --mem=120gb
#SBATCH -c 20
#SBATCH -o quick_af.o%j

# Start from a clean environment and expose only the easybuild_avx2 module tree
module purge
export MODULEPATH=/usr/local/easybuild_avx2/modules/all
module load AlphaFold/2.1.1-fosscuda-2020b
export ALPHAFOLD_DATA_DIR="/mnt/pan/AlphaFold"

# Report the job scratch directory and the current working directory
echo "$PFSDIR"
pwd
echo "$lsADD"

# Stage the input sequences on scratch and create the output directory there
cp -r "$WORKDIR/fastas" "$PFSDIR"
mkdir -p "$PFSDIR/runs"

echo "Running AlphaFold from $WORKDIR"
/home/mrd20/hpcdemo/alphafold/ \
          --fasta_paths=$PFSDIR/fastas/T1050.fasta \
          --data_dir=$ALPHAFOLD_DATA_DIR \
          --pdb70_database_path=$ALPHAFOLD_DATA_DIR/pdb70/pdb70 \
          --uniref90_database_path=$ALPHAFOLD_DATA_DIR/uniref90/uniref90.fasta \
          --mgnify_database_path=$ALPHAFOLD_DATA_DIR/mgnify/mgy_clusters_2018_12.fa \
          --uniclust30_database_path=$ALPHAFOLD_DATA_DIR/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
          --max_template_date=2020-05-14 \
          --db_preset=full_dbs \
          --output_dir=$PFSDIR/runs



Directory Structure

$ ls -RF /scratch/pbsjobs/job.1184.hpc/

fastas/  runs/

Output directory for the target, under runs/:

features.pkl  msas/  result_model_1_ptm.pkl  result_model_2_ptm.pkl  unrelaxed_model_1_ptm.pdb  unrelaxed_model_2_ptm.pdb

Contents of msas/:

bfd_uniclust_hits.a3m  mgnify_hits.sto  pdb_hits.hhr  uniref90_hits.sto
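The listing above suggests a simple completeness check once a job ends. This is a minimal sketch that tests a run directory for features.pkl, an msas/ subdirectory, and at least one unrelaxed_model_*.pdb file. The helper check_af_outputs is hypothetical, and the demo runs on a mock directory of empty files rather than real AlphaFold output.

```shell
#!/bin/bash
# Sketch only: check_af_outputs is a hypothetical helper.
check_af_outputs() {
    local run_dir=$1 missing=0
    [ -f "$run_dir/features.pkl" ] || { echo "missing: features.pkl"; missing=1; }
    [ -d "$run_dir/msas" ]         || { echo "missing: msas/"; missing=1; }
    # At least one predicted structure should be present.
    if ! ls "$run_dir"/unrelaxed_model_*.pdb >/dev/null 2>&1; then
        echo "missing: unrelaxed_model_*.pdb"
        missing=1
    fi
    return "$missing"
}

# Demo on a mock run directory (empty placeholder files, not real results).
demo=$(mktemp -d)
mkdir "$demo/msas"
touch "$demo/features.pkl" "$demo/unrelaxed_model_1_ptm.pdb"
check_af_outputs "$demo" && echo "run directory looks complete"
rm -rf "$demo"
```

The function returns a non-zero status when anything is missing, so it can gate a later step such as copying results out of scratch.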


Each stage of the AlphaFold workflow is reported in the job output file (quick_af.o<jobid> for the script above). Typically sections involve:

Full sample output for T1050:  alphafold_T1050.out

