Update in Jan 2014: This is an old document. Currently the login node is hpclogin.case.edu
Information Technology Services (ITS) significantly updated its High Performance Computing (HPC) cluster infrastructure to provide more online storage, faster computing, and increased energy efficiency. The HPC cluster consists of a collection of servers functioning as a single computing system. The cluster provides users with the ability to run parallel computational tasks requiring hundreds of simultaneous processors in support of their research projects. High-speed parallel disk storage was increased from 20 Terabytes to 70 Terabytes. The upgrade more than doubles overall computational performance to 12 trillion floating-point operations per second, while energy use for power and cooling increases by only 12%.
Researchers in over 31 departments use the HPC cluster, with the largest use coming from Biochemistry, Physiology and Biophysics, Biomedical Engineering, Physics, and Chemistry. Dr. Calvetti and Dr. Turc in the Department of Mathematics require students in Applied Math classes to use the HPC cluster to learn how to do scientific computing. Dr. Mihos in the Department of Astronomy uses the HPC resource to simulate the evolution of galaxies and galaxy clusters. Dr. Shoham’s research in the Department of Biochemistry uses the HPC cluster to focus on drug discovery and drug screening to combat infectious diseases, heart disease, and cancer. Dr. Sichun Yang recently chose to join the Center for Proteomics and Bioinformatics in part because he was able to use the HPC resource on his first day at CWRU and avoid months of non-productive time setting up his own computational facility for his research.
Find out more about the HPC resource by visiting the links on our website.
Upgraded Cluster Architecture
What are the differences you should be aware of?
Temporary output files in detail:
In the old cluster, users could access the temporary output files (*.OU and *.ER) in their own home directories. With this upgrade, we are moving away from that setup, because it requires continuously writing the temporary output files from the compute nodes back to the home directory. While these files can be tiny, they can also grow large, so they are now stored locally on the compute nodes instead.
In the new cluster, users can run qpeek <jobid> to view these local temporary output files (*.OU and *.ER):
$ qpeek 2967
comp073 <- node hosting the job
Mon May 16 13:00:05 EDT 2011 <- job start time
Only the user who submitted the job can run this command successfully.
If the job has already finished, qpeek will return an error.
If you need to view the output page by page: qpeek <jobid> | more
If you want to save the output, redirect it to a file: qpeek <jobid> > mytempfile
Your other files will be located in $PFSDIR or $TMPDIR, depending on where you copied them and where your job runs. If you do not copy your files to one of these locations, they remain in your working directory, $PBS_O_WORKDIR.
As a refresher, these are the locations of the temporary directories:
$PFSDIR = /scratch/pbsjobs/pbstmp.<jobid>.mast (accessible from any node)
$TMPDIR = /tmp/pbstmp.<jobid>.mast (local to the first compute node where the job is running)
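As a sketch, a job script can stage its work through one of these scratch directories and copy results back before the job ends, since scratch space is cleaned up after the job finishes. The fallback values and the placeholder computation below are illustrative assumptions so the script can be read (and tried) outside PBS; on the cluster, the scheduler sets $PBS_O_WORKDIR and $PFSDIR for you.

```shell
#!/bin/bash
#PBS -N demo
#PBS -l nodes=1:ppn=1

# Scheduler-provided directories, with illustrative fallbacks for
# running the sketch outside PBS (assumptions, not site defaults):
WORKDIR=${PBS_O_WORKDIR:-$PWD}
SCRATCH=${PFSDIR:-$(mktemp -d)}

# Do the work in the fast scratch directory.
cd "$SCRATCH"

# A placeholder stands in for the real computation (e.g. an MPI solver):
echo "job ran on $(hostname)" > results.out

# Copy results back to the submission directory before the job exits;
# scratch space is removed once the job finishes.
cp results.out "$WORKDIR"/
```

The same pattern applies with $TMPDIR, with the caveat that $TMPDIR is local to one compute node, so results must be copied back from that node before the job ends.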