Coeus HPC Cluster

Getting Started 

Gaining Access and Connecting

Request an account

The Coeus cluster is a grant-funded resource provided through the Portland Institute for Computational Science and Portland State University.

First, you will need to make sure your compute jobs are either capable of parallelism or require significant HPC resources (as opposed to general linux compute servers).  To request cluster access, use this form.

Connecting and Logging in

Command Line Interface

If you are not accustomed to using a Linux command line interface (CLI), we recommend familiarizing yourself with introductory material such as the book The Linux Command Line, https://sourceforge.net/projects/linuxcommand/files/TLCL/19.01/TLCL-19.01.pdf/download

or http://www.pcworld.com/article/214370/12_commands_every_linux_newbie_should_learn.html.  The ability to navigate and manage files at the Linux command line is important for working effectively on the cluster.

Secure Shell (SSH) client 

To connect to these servers you will need to use Secure Shell (SSH), run through a terminal emulation client application.  Secure Shell is a standard, encrypted means of connecting to remote servers.  If you use Linux or OSX, terminal applications are included with the operating system; Windows users will need to download a client such as PuTTY.  This OIT FAQ, Secure Shell (SSH), explains how to connect on Windows and OSX.  These clients give you access to the Linux command line interface (CLI).

> ssh odinID@login1.coeus.rc.pdx.edu

> ssh odinID@login2.coeus.rc.pdx.edu 

File Transfer with sFTP or SCP

To move files to OIT-RC Linux systems you will need to use a secure file transfer protocol such as SFTP, SCP, or rsync.  There are many free graphical client programs, such as WinSCP (compatible with PuTTY), Fugu for OSX, and CyberDuck and FileZilla for both OSX and Windows.  scp and sftp can also be used from the Linux and OSX CLI.
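
For example, from the Linux or OSX command line (a minimal sketch; the file name and the /scratch path shown here are placeholders):

> scp mydata.tar.gz odinID@login1.coeus.rc.pdx.edu:/scratch/odinID/

> sftp odinID@login1.coeus.rc.pdx.edu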

X Server

If you require a graphical interface (for example, to run MatLab with the graphic interface) you will need an X server.  There are excellent free X servers, such as XQuartz for OSX and Xming and MobaXterm for Windows.   Linux distributions will have native support, but you may need to install the proper packages and enable and configure the X Window System.  

To log in to the Linux systems with X forwarding enabled, add the "-X" option to the ssh command.  To test this on Coeus, once logged in, type "xclock" to open a clock in a graphical interface.

> ssh odinID@login1.coeus.rc.pdx.edu -X

> xclock

Remote access to login nodes

Direct ssh access to the login nodes is limited to the on-campus PSU IP range (i.e. it does not include the guest wireless network).

Your First Login

Automatic Environment Setup 

IMPORTANT: When you first log in to the Coeus cluster, your home directory will be generated automatically, and you will be guided through creating a proper environment for compiling and running parallel programs.  Answer "yes" at the prompts.  If you answer "no", or if you are never prompted, the setup script needs to be run again.  To trigger it, run

> touch ~/.actrun 

then logout and then back in. 

This setup creates an SSH key to connect with cluster nodes for passwordless communications, adds /act/bin to the PATH variable, and adds the module command to the user's environment. 
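
As a quick, optional sanity check after the setup, you can confirm that /act/bin is on your PATH and that the module command is available:

> echo $PATH | grep act/bin

> module list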



Operating Environment

Coeus Home Directory (homedir - /home/odinid)

Your Coeus home directory is separate from the general research home directory (used for other systems in the PSU research computing infrastructure).  Separate home directories are used because different computational systems often require different local system settings.  Your Coeus home directory will have the configuration files noted in the previous section, as well as any cluster-specific, custom settings you add.  For more information on /home and the other file systems on Coeus, refer to the section on File Systems and Data Storage below.

Login nodes

These are the servers where users interact with the file system, scheduler, and other tools.  The Coeus login nodes are login1.coeus.rc.pdx.edu and login2.coeus.rc.pdx.edu.

Important! Do not run long computational jobs on the login servers.  These are for logging in, accessing your home directory, accessing file systems, writing and editing files, compressing and uncompressing data sets, compiling software, scheduling computational jobs, testing software, etc.  Computational jobs should be run on compute nodes through the SLURM job scheduler.  Long computational processes running on login nodes, and any unscheduled jobs, are liable to be terminated without notification.

Modules

This cluster uses Linux environment modules to allow users to quickly update their environment, including execution paths, library paths, and man page paths, for specific software packages.  This allows users to enable and disable software as needed.  For example, the Coeus cluster has module environments created for each available MPI implementation (openmpi, mpich, mvapich).

Basic module usage

To obtain a complete list of all modules currently available on the system

> module avail

To load a module, e.g. GCC 6.3.0 compilers

> module load gcc-6.3.0

To load a module, e.g. MVAPICH2 2.2 compiled with GCC 6.3.0 (this will automatically load the gcc-6.3.0 module)

> module load mvapich2-2.2/gcc-6.3.0

To obtain a complete list of currently loaded modules

> module list

Currently Loaded Modulefiles:

 1) gcc-6.3.0                2) mvapich2-2.2/gcc-6.3.0

To unload a module, e.g. MVAPICH2 2.2 compiled with GCC 6.3.0 (this will automatically unload the gcc-6.3.0 module, too)

> module unload mvapich2-2.2/gcc-6.3.0
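
Two other module subcommands are often useful (these are standard environment-modules commands, not specific to Coeus; the module name below is just the GCC example from above).

To unload all currently loaded modules at once

> module purge

To display what a modulefile would set in your environment

> module show gcc-6.3.0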

NERSC has an excellent Modules usage reference

Software

Modules load the selected software on each of these systems from the /vol/apps/hpc volume, where a broad range of software is available.  See the Applications Volume entry under File Systems and Data Storage below for examples of what is installed there.

File Systems and Data Storage

Coeus Home Directory.  /home/odinid

Your home directory is on a shared filesystem that is mounted on all cluster nodes.  It should be used to store your batch scripts, system configurations, locally compiled software, libraries, and config/settings files.  Home directories are backed up to tape on a nightly basis.  Be advised that running calculations against data living in your home directory will be much slower; use it to store backups of your data and do the computation on scratch storage.

Scratch storage.  /scratch

Data for your computational work should be put in scratch.  You can create your own personal and group project folders here.  This shared filesystem is mounted on all cluster nodes.  This is a large volume intended for temporary storage of data used in computational processes.  This volume is not backed up and all files stored here are considered to be temporary.  

Scratch is managed with a modified first-in, first-out policy: the largest consumers of storage are prioritized for deletion, and the oldest files are removed first.  Once this volume reaches a certain threshold, you may be asked to remove directories/files.  If it passes a critical threshold, system administrators reserve the right to remove all files.
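
For example, to create a personal project folder in scratch and check how much space it is using (a sketch; the directory names are placeholders):

> mkdir -p /scratch/odinID/myproject

> du -sh /scratch/odinID/myproject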

Other Volumes

Applications Volume.  /vol/apps/

Common applications are stored in /vol/apps/hpc/stow.  This volume is mounted on all cluster nodes and is the same applications volume used by other OIT-RC systems.  It includes commonly used software such as R and Matlab, as well as a variety of other tools for bioinformatics, genetics, and GIS, all of which are loaded using modules.  This is a read-only volume.

GCC compiler versions can be found in /vol/apps/gcc/ and Python versions in /vol/apps/python/.
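
To see which versions are currently installed, you can simply list those directories:

> ls /vol/apps/gcc /vol/apps/python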

Research shares.  /vol/share/sharename

Research storage shares are common to all OIT-RC systems.  They are only mounted on the cluster's login nodes, in order to facilitate copying data to the /scratch volume.  /vol/share is a good place to move data that should be backed up, for example resultant data from computational runs.  Do not run computational jobs against data stored on /vol/share.  This volume is backed up.  (PSU access only)
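
For example, to copy finished results from scratch to a research share while logged in to a login node (a sketch; sharename and the paths are placeholders):

> rsync -av /scratch/odinID/myproject/results /vol/share/sharename/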

Workspace scratch storage.  /vol/workspace

This scratch volume is common to multiple OIT-RC computational systems.  It is only mounted on Coeus login nodes, in order to facilitate copying data to the /scratch volume.  Computational work is not allowed on login nodes; do not run computational processes against data stored on /vol/workspace.  This volume is not backed up and all files stored here are considered to be temporary.

Running Parallel Programs

SLURM Workload Manager

We use the SLURM Workload Manager for job control and management.  There are a number of user commands for the scheduler; for getting started, the most salient are sbatch, squeue, scancel, sinfo, and srun.  A sample submit script and the use of some of these commands are included in the section "Compiling A Simple MPI Program" below.  For more information, visit the SLURM Quick Start User Guide, which is a good, more detailed introduction to SLURM.

For more on SLURM parallelism, visit here.

For more on the SLURM Scheduler, refer to this page.
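
For day-to-day monitoring, the basic commands look like this (a quick sketch; the job ID is just an example number, and squeue -u limits the listing to your own jobs):

$ squeue -u odinID

$ scancel 348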

Partitions

There are many ways of dividing up and managing a cluster.  Partitions are a means of dividing hardware and nodes into useful groupings, and these hardware groups can have very different parameters assigned to them.  Currently, Coeus is divided into three general CPU node partitions, one aggregate CPU partition, an Intel Phi processor partition, a large-memory partition (with GPUs), and a GPU partition.  Note that these partitions and parameters may change in the future as demand requires.

The sinfo command will display an overview of partitions. 
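
For example (the -p option restricts the listing to a single partition; "short" is used here only because it appears in the example below):

$ sinfo

$ sinfo -p short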

Compiling A Simple MPI Program 

This is an example session in which a simple MPI "Hello World" program is compiled and run.  It assumes the program file is named mpi_hello.c, the Open MPI library is used, the submission script is submit_mpi_hello.sh, the job is submitted to the "short" partition, and the output goes to a file named mpi_hello.txt.

The program file - mpi_hello.c

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;
    char name[80];
    int length;

    MPI_Init(&argc, &argv);   // note that argc and argv are passed by address

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &length);

    printf("Hello MPI: processor %d of %d on %s\n", rank, size, name);

    MPI_Finalize();

    return 0;
}


To compile the program mpi_hello (assuming you have created the sample program)

$ module load openmpi-3.0.1/gcc-9.2.0

$ mpicc -o mpi_hello mpi_hello.c  

Scheduler submission script - submit_mpi_hello.sh

#!/bin/bash
#SBATCH --job-name mpi_hello
#SBATCH --nodes 2
#SBATCH --ntasks-per-node 2
#SBATCH --partition short
#SBATCH --output mpi_hello.txt

module load openmpi-3.0.1/gcc-9.2.0

mpiexec ./mpi_hello

# run sleep for 20 sec. so we can test the 'squeue' command
srun sleep 20

Submit the program mpi_hello to the SLURM scheduler (assuming you have created the sample program and submit script)

$ sbatch submit_mpi_hello.sh 

The “squeue” command should now show a running job.

$ squeue

   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)

     348    short mpi_hell     will  R       0:14     2 compute[127-128]


After this runs, listing the directory contents should show the C code file, the compiled program, the submission script, and the output file.

$ ls 

mpi_hello  mpi_hello.c  mpi_hello.txt  submit_mpi_hello.sh

The output file will show the nodes and cores that it ran on. 

$ cat mpi_hello.txt

Hello MPI: processor 0 of 4 on compute127.cluster

Hello MPI: processor 1 of 4 on compute127.cluster

Hello MPI: processor 2 of 4 on compute128.cluster

Hello MPI: processor 3 of 4 on compute128.cluster

If a job does a lot of the same thing, like running the same calculation on different inputs, it is highly recommended to use a job array.
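
A minimal sketch of a job-array submission script, assuming input files named input_1.txt through input_10.txt and a program ./myprog that processes one of them per task (both names are hypothetical):

#!/bin/bash
#SBATCH --job-name array_example
#SBATCH --partition short
#SBATCH --array 1-10
#SBATCH --output array_example_%a.txt

# SLURM sets SLURM_ARRAY_TASK_ID to a different value (1-10) for each array task,
# and %a in the output file name expands to that same task ID
srun ./myprog input_${SLURM_ARRAY_TASK_ID}.txt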

For more examples of SBATCH scripts, please refer to the SLURM Scheduler page.

Coeus Priority Access

In addition to the Free access tier, there is now a Priority access tier making it possible for researchers to reserve dedicated computer time for their funded research needs. Details are available in OIT’s description of the High Performance Computing (HPC) Clusters service, including a link to the HPC Priority Access request form where researchers can engage with OIT to assess their HPC requirements in order to include funding for Priority access in future research grant proposals.

High Priority Partitions

After your request has been processed, you will be able to submit jobs to the higher-priority partitions.  Jobs submitted to these partitions will preempt jobs in the regular tier.  More details on the node specifications can be found here.  The maximum job runtime on these partitions is 20 days.  Send a request to help-rc@pdx.edu if you need to extend the runtime of your job beyond the maximum time limit.

Submit a High Priority Job

To submit a job to a high priority partition, one of the following SLURM parameters has to be passed.  For regular compute nodes:

--partition priority_access

for himem nodes:

--partition priority_access_himem

or for gpu jobs:

--partition priority_access_gpu
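
For example, to send the sample job from above to the regular priority access partition, the parameter can go on the sbatch command line

$ sbatch --partition priority_access submit_mpi_hello.sh

or be added as an #SBATCH line in the submission script itself (submit_mpi_hello.sh is just the earlier example script).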