RAxML

"RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees. It can also be used for post-analyses of sets of phylogenetic trees, analyses of alignments and, evolutionary placement of short reads."

For more detailed information about RAxML refer to:

Which RAxML to Use

There are currently three versions of RAxML available on the Coeus cluster:

> module avail 2>&1 | grep RAxML

Biosciences/RAxML/8.2.11/gcc

Biosciences/RAxML/8.2.11/openmpi-2.0/gcc/hybrid

Biosciences/RAxML/8.2.11/openmpi-2.0/gcc/mpi

The module ending in gcc is the sequential version, the one ending in mpi is the MPI (Message Passing Interface) parallelized version, and the one ending in hybrid is the hybrid (MPI and PThreads) parallelized version. The hybrid version uses MPI to distribute bootstrap replicates or independent tree searches to different shared-memory nodes in a cluster, and uses PThreads to parallelize the likelihood calculations of each single tree search.

More information on the hybrid version can be found in this paper:

Hybrid MPI/Pthreads Parallelization of the RAxML Phylogenetics Code
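The division of work in the hybrid version can be illustrated with some simple arithmetic. The sketch below is only an illustration: hybrid_layout is a hypothetical helper, the core count per node is an assumed value rather than a Coeus specification, and the even split of replicates across MPI ranks is a simplification of how RAxML actually schedules work.

```shell
#!/bin/bash
# Hypothetical helper: shows how a hybrid run could be laid out with
# one MPI rank per node and one PThread per core on that node.
# Arguments: number of nodes, cores per node (assumed), replicates.
hybrid_layout() {
    local nodes="$1" cores_per_node="$2" replicates="$3"
    local mpi_ranks="$nodes"                  # MPI distributes work across nodes
    local threads_per_rank="$cores_per_node"  # PThreads share memory within a node
    # Round up so that every replicate is assigned to some rank.
    local per_rank=$(( (replicates + mpi_ranks - 1) / mpi_ranks ))
    echo "ranks=${mpi_ranks} threads_per_rank=${threads_per_rank} replicates_per_rank=${per_rank}"
}

# Example: 4 nodes, an assumed 20 cores per node, 1000 replicates.
hybrid_layout 4 20 1000
```

With these assumed numbers, each of the 4 ranks would handle 250 replicates, and each replicate's likelihood calculations would be spread over 20 threads.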

The sequential version works best for small to medium datasets and for initial experiments. The MPI version works best for very large production runs, for example runs with a large number of bootstraps.

For more detailed information about differences between versions refer to the manual.

To use RAxML on the Coeus cluster, run the module load command followed by the name of the appropriate RAxML module. When you are ready to run your job, submit an sbatch script so that RAxML runs on compute nodes rather than directly on the login nodes. Refer to the example sbatch scripts below. More information about using SLURM to submit jobs can be found on the SLURM Scheduler page.
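As a small convenience, the choice of module could be wrapped in a shell function. This is only a sketch: raxml_module_for is a hypothetical helper, not something provided by RAxML or by the cluster, and the module names are simply the three listed above.

```shell
#!/bin/bash
# Hypothetical helper: maps a short mode name to one of the RAxML
# module names listed on this page.
raxml_module_for() {
    case "$1" in
        sequential) echo "Biosciences/RAxML/8.2.11/gcc" ;;
        mpi)        echo "Biosciences/RAxML/8.2.11/openmpi-2.0/gcc/mpi" ;;
        hybrid)     echo "Biosciences/RAxML/8.2.11/openmpi-2.0/gcc/hybrid" ;;
        *)          echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Print the module name for the MPI version.
# On the cluster this would be used as: module load "$(raxml_module_for mpi)"
raxml_module_for mpi
```

Keeping the mapping in one place means that a version change only needs to be made once if several job scripts share the helper.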

Example SLURM sbatch Scripts

Basic sequential job with bootstraps

This job is based on an example from the Exelixis Lab.

First, create the script sub_raxml_ex1.sh:

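#!/bin/bash
# Job name, partition, and output file (%j expands to the job ID)
#SBATCH --job-name RAxML_example
#SBATCH --partition medium
#SBATCH --output=raxml_bootstrap_%j

# Start from a clean environment, then load the sequential RAxML module
module purge
module load Biosciences/RAxML/8.2.11/gcc

# Run the sequential AVX2 build of RAxML on the allocated compute node
srun raxmlHPC-AVX2 -m GTRGAMMA -p 12345 -N 100 -s dna.phy -n ex1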

RAxML flags used:

The -m flag specifies the substitution model (binary, nucleotide, multi-state, or amino acid). GTRGAMMA is a nucleotide model.
The -p flag specifies the random number seed. It must be specified for any option that requires randomization.

Using the same seed will yield the same results.

The -N flag specifies the number of alternative runs on distinct starting trees. In combination with the -b option, it specifies the number of bootstrap replicates instead.
The -s flag specifies the name of the alignment data file (in PHYLIP or FASTA format).
The -n flag specifies the name to be appended to the output files. It must be specified.
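To make the role of each flag easier to see, the command line can be assembled from named variables. The sketch below only prints the command rather than running RAxML; build_raxml_cmd is a hypothetical helper introduced for illustration, and the values are the ones used in the example script on this page.

```shell
#!/bin/bash
# Hypothetical helper: prints (does not run) a RAxML command line
# built from the flags described above.
build_raxml_cmd() {
    local model="$1"      # -m: substitution model
    local seed="$2"       # -p: random number seed
    local runs="$3"       # -N: number of alternative runs
    local alignment="$4"  # -s: alignment file (PHYLIP or FASTA)
    local run_name="$5"   # -n: name appended to output files
    echo "raxmlHPC-AVX2 -m ${model} -p ${seed} -N ${runs} -s ${alignment} -n ${run_name}"
}

# Reproduces the command used in sub_raxml_ex1.sh.
build_raxml_cmd GTRGAMMA 12345 100 dna.phy ex1
```

Printing the command first is a cheap way to check the flags before submitting a long job.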

Then submit the job to the SLURM scheduler with the sbatch command:

> sbatch sub_raxml_ex1.sh

Using MPI version for bootstraps

This job is based on an example from the Exelixis Lab.

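#!/bin/bash
# Job name, node count, partition, and output file (%j expands to the job ID)
#SBATCH --job-name RAxML_MPI_example
#SBATCH --nodes 4
#SBATCH --partition medium
#SBATCH --output=raxml_bootstrap_mpi_%j

# Start from a clean environment, then load the MPI RAxML module
module purge
module load Biosciences/RAxML/8.2.11/openmpi-2.0/gcc/mpi

# Launch the MPI build of RAxML across the allocated nodes
mpirun raxmlHPC-MPI-AVX2 -m GTRGAMMA -p 12345 -N 1000 -s dna.phy -n ex2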

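Differences from the sequential example:

The number of nodes is requested with #SBATCH --nodes.
The MPI module is loaded and the raxmlHPC-MPI-AVX2 binary is used instead of the sequential ones.
The program is launched with mpirun rather than srun.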

Using hybrid version

We don't currently have a working example for the hybrid version of this software.
