Information about TIFR-CAM cluster

Hardware and software

 Feature            Available
 Number of nodes    16
 Number of cores    104
 RAM                104 GB
 Interconnect       Gigabit Ethernet
 Operating system   CentOS 5, Rocks 5
 Compilers          gcc 4.1.2, gfortran 4.1.2, pgi 7.2-2
 MPI                MPICH2, OpenMPI

Torque/PBS

The nodes are divided into two groups; see the file /opt/torque/server_priv/nodes

 Group   Nodes           Cores/node   Total cores   CPU
 nash    c0-0 to c0-4    4            20            Dual-core AMD Opteron 2220 @ 1 GHz
 hardy   c0-5 to c0-14   8            80            Quad-core AMD Opteron 2352 @ 1.05 GHz


PBS is a batch system that manages the parallel jobs submitted by users. On the cluster, PBS uses Maui as the scheduler. Jobs are submitted to PBS using a script; examples are given below in the OpenMPI and MPICH2 sections. If the script is called famosa.pbs, you can submit the job to PBS using

$ qsub famosa.pbs

Here are some parameters that can be given in a PBS script file:

  * -N jobname (name the job)
  * -q @nic-cluster.cc.umr.edu (the server to send the job to; the address shown here is only an example)
  * -e errfile (redirect standard error to a file named errfile)
  * -o outfile (redirect standard output to a file named outfile)
  * -j oe (combine standard output and standard error)
  * -l walltime=N (request a walltime of N in the form hh:mm:ss)
  * -l cput=N (request N sec of CPU time; or in the form hh:mm:ss)
  * -l mem=N[KMG][BW] (request a total of N {kilo|mega|giga}{bytes|words} of memory across all requested processors)
  * -l nodes=N:ppn=M (request N nodes with M processors per node)
  * -m abe (mail the user when the job aborts, begins running, or ends)
  * -S shell (use shell instead of your login shell to interpret the batch script; must include a complete path)
  * -V (job inherits the full environment of the current shell, including $DISPLAY)
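
As a rough illustration of how these options combine, the sketch below prints a PBS header from a few arguments. The function names to_walltime and pbs_header and their argument order are hypothetical, chosen only for this example; the #PBS directives themselves are the ones listed above.

```shell
# Convert a number of seconds to the hh:mm:ss form expected by -l walltime.
to_walltime() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( ($1 % 3600) / 60 )) $(( $1 % 60 ))
}

# Hypothetical helper: print a PBS header.
# Arguments: job name, number of nodes, cores per node, walltime in seconds.
pbs_header() {
    echo "#PBS -N $1"
    echo "#PBS -l nodes=$2:ppn=$3"
    echo "#PBS -l walltime=$(to_walltime "$4")"
    echo "#PBS -j oe"
    echo "#PBS -m abe"
}

# Example: 5 nodes, 6 cores each, 48 hours.
pbs_header rae2822 5 6 172800
```

Redirecting the output to a file and appending the commands to run gives a script that can be passed to qsub.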

Once a job is submitted, you can check its status using qstat

[praveen@master piaggio_pso]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
25.master                 20080919_3       roms            22:58:50 R default        
28.master                 FAMOSA           praveen         00:11:11 R default 

To get more detailed information, use qstat -f or qstat -f <jobid>
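
When many jobs are listed, the qstat output can be filtered with awk. The sketch below assumes the column layout shown above (user name in the third column, job state in the fifth); the function name jobs_of_user is made up for this example, and the sample text is copied from the listing above so that the snippet can be tried without PBS.

```shell
# Print job id and state for one user; reads qstat-style output on stdin.
# Assumes the column order shown above: id, name, user, time, state, queue.
jobs_of_user() {
    awk -v user="$1" 'NR > 2 && $3 == user { print $1, $5 }'
}

# Sample input taken from the qstat listing above.
sample='Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
25.master                 20080919_3       roms            22:58:50 R default
28.master                 FAMOSA           praveen         00:11:11 R default'

printf '%s\n' "$sample" | jobs_of_user praveen
```

On the cluster the same function would be used as qstat | jobs_of_user praveen.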

To delete a running job, use

$ qdel <jobid>

If the job is not killed by the above command, then force it using

$ qdel -p <jobid>

Note: qpeek was not working when Torque was installed from Rocks 5. It would give an error that there is no file in /opt/torque/spool. After commenting out line 142 in /opt/torque/bin/qpeek, it works.

Some PBS troubleshooting

Sometimes PBS or Maui may not have started properly. To start PBS:

/sbin/service pbs_server start

If PBS is working but a job stays queued and never starts, then Maui may not be running. Start it with

/usr/local/maui/sbin/maui

If qstat shows only your own jobs instead of all jobs, the server setting below has to be changed; this requires root access.

qmgr -ac "set server query_other_jobs = True"

Selecting MPI version

There are several versions of MPI installed. To see the available versions:

[praveen@turing ~]$ mpi-selector --list
mvapich2_gcc-1.6
mvapich_gcc-1.2.0
openmpi_gcc-1.4.3
openmpi_gcc44-1.4.3
pgimpi

I recommend using openmpi_gcc44-1.4.3 since it works well with PBS. To select it:

[praveen@turing ~]$ mpi-selector --set openmpi_gcc44-1.4.3
Defaults already exist; overwrite them? (y/N) y

You need to log out and log in again for the paths to be updated. You can check your MPI setting with

[praveen@turing ~]$ mpi-selector --query
default:openmpi_gcc44-1.4.3
level:user

Also check that PATH has been set correctly; you should get the following output.

[praveen@turing ~]$ which mpirun
/opt/openmpi-1.4.3/bin/mpirun

OpenMPI

OpenMPI was compiled with the following configure options

./configure --with-tm=/opt/torque --prefix=/opt/openmpi-1.2.7 \
                      --enable-prefix-by-default --enable-static

After compiling, check that all required features are enabled using ompi_info. In particular, to verify that torque support is built in, do

[praveen@master ]$ /opt/openmpi-1.2.7/bin/ompi_info |grep tm
              MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.7)
                 MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.7)
                 MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.7)

The following is an example PBS script for use with OpenMPI. Set PATH and LD_LIBRARY_PATH if they are not set in your .bashrc file. Here they are commented out.

#PBS -N "rae2822"
#PBS -l "nodes=5:hardy:ppn=6"
#PBS -l "walltime=48:00:00"
#PBS -j oe
#PBS -o famosa.log
#PBS -m e

#export OPENMPI=/opt/openmpi-1.2.7
#export PATH=$OPENMPI/bin:$PATH
#export LD_LIBRARY_PATH=$OPENMPI/lib

cd $PBS_O_WORKDIR

mpirun $HOME/src/famosa/build/bin/Famosa_mpi


MPICH2

MPICH2 is installed using Rocks in /opt/mpich2/gnu and uses gfortran as the Fortran compiler.

Using /opt/mpich2/gnu/bin/mpirun

#PBS -N "FAMOSA"
#PBS -l "nodes=5:hardy:ppn=6"
#PBS -l "walltime=00:10:00"
#PBS -j oe
#PBS -o "famosa.log"
#PBS -m e

export LD_LIBRARY_PATH=/opt/mpich2/gnu/lib
export PATH=/opt/mpich2/gnu/bin:$PATH

# go to working directory
cd $PBS_O_WORKDIR

# run the mpd daemon on all nodes
N_ALL=`cat $PBS_NODEFILE | wc -l`
N_UNI=`sort -u < $PBS_NODEFILE | wc -l`

cp $PBS_NODEFILE  ./nodes_all.txt
sort -u < $PBS_NODEFILE > nodes_unique.txt

mpdboot -n $N_UNI -f nodes_unique.txt
sleep 10
mpirun -n $N_ALL -machinefile nodes_all.txt ~/src/famosa/build/bin/Famosa_mpi
mpdallexit
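
The two counts in this script differ because PBS writes one line per requested core into $PBS_NODEFILE, so each host name is repeated ppn times. The sketch below imitates this with a made-up node file (two nodes with three cores each; the node names are only examples) so that the counting can be tried outside a job.

```shell
# Build a fake node file as PBS would for nodes=2:ppn=3
# (assumed layout: one line per core, host name repeated ppn times).
nodefile=$(mktemp)
printf '%s\n' c0-5 c0-5 c0-5 c0-6 c0-6 c0-6 > "$nodefile"

# Total number of MPI processes: one per line of the node file.
N_ALL=$(wc -l < "$nodefile")
# Number of distinct hosts: one mpd daemon is started on each of them.
N_UNI=$(sort -u "$nodefile" | wc -l)

# Arithmetic expansion strips any padding that wc may add.
echo "processes: $((N_ALL)), hosts: $((N_UNI))"
rm -f "$nodefile"
```

In the job script above, N_UNI is what mpdboot needs and N_ALL is what mpirun needs.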

Using /opt/mpiexec/bin/mpiexec

Use mpiexec in /opt/mpiexec to launch MPICH2 programs together with PBS. An example script is given below.

#PBS -N "rae2822"
#PBS -l "nodes=5:hardy:ppn=6"
#PBS -l "walltime=48:00:00"
#PBS -j oe
#PBS -o famosa.log
#PBS -m e

cd $PBS_O_WORKDIR

/opt/mpiexec/bin/mpiexec --comm=pmi $HOME/src/famosa/build/bin/Famosa_mpi


Useful commands

cluster-fork

This command can be used to execute something on all nodes. For example, to see the list of processes for user praveen, do

cluster-fork ps -U praveen

To run some command only on a particular set of nodes, use

cluster-fork -n "c0-0 c0-1 c0-2 c0-3 c0-4" ps -U praveen

Another way is to use

cluster-fork --nodes="c0-%d:5-14" ps -U praveen
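
To make the pattern form concrete, the loop below prints the node names that c0-%d:5-14 is meant to cover. It only illustrates the naming scheme using printf and seq; it is not how cluster-fork itself expands the argument, and the function name expand_nodes is hypothetical.

```shell
# Print one node name per integer in [first, last], using a printf-style pattern.
# Arguments: pattern (for example 'c0-%d'), first index, last index.
expand_nodes() {
    for i in $(seq "$2" "$3"); do
        printf "$1\n" "$i"
    done
}

# Lists c0-5, c0-6, ... up to c0-14, one per line.
expand_nodes 'c0-%d' 5 14
```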

checkjob

This command gives some information about a submitted job

checkjob -v <JOBID>

where JOBID is given by qstat.

showq

showq gives a concise summary of all jobs running or in the queue.

showscript

showscript will return the contents of the PBS script that you have submitted. The only argument is the job's PBS jobid. 

mjobctl

You can use this to suspend or resume a PBS job. See the help output:

[praveen@master ]$ mjobctl --help
Usage: mjobctl [FLAGS]
  --about
  --configfile=<FILENAME>
  --format=<FORMAT>
  --help
  --host=<SERVERHOSTNAME>
  --keyfile=<FILENAME>
  --loglevel=<LOGLEVEL>
  --port=<SERVERPORT>
  --version

  -c <JOBID> // CANCEL
  -C <JOBID> // CHECKPOINT
  -h <JOBID> // HOLD
  -r <JOBID> // RESUME
  -R <JOBID> // REQUEUE
  -s <JOBID> // SUSPEND
  -S <JOBID> // SUBMIT
  -x <JOBID> // EXECUTE
