Updated: April 1, 2018
Use qstat -f to show the full listing of queue instances and their current states.
More information on qstat here: http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/commands/qstat.htm
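d: disabled (queue); on a job, 'd' means the job is marked for deletion
r: running (job)
E: error
a: alarm
u: unknown (the execution daemon on the node is unreachable)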
More information on queue states here: http://www.softpanorama.org/HPC/Grid_engine/Queues/queue_states.shtml
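
To pick out only the queues that need attention, the qstat -f listing can be filtered on its last column. The sketch below is a minimal example, not a definitive tool: problem_queues is a hypothetical helper name, and it assumes the usual qstat -f layout in which a queue line has the queue@host name first and the state, when present, as the sixth whitespace-separated field.

```shell
# Sketch: list queue instances whose "states" column is non-empty.
# Assumed layout of a queue line in `qstat -f` output:
#   queue@host  qtype  resv/used/tot.  load_avg  arch  [states]
# Header, separator, and job lines have no "@" in the first field and are skipped.
problem_queues() {
    awk '$1 ~ /@/ && NF >= 6 { print $1, $6 }'
}

# On a cluster: qstat -f | problem_queues
```

Reading from standard input keeps the helper usable on saved qstat -f output as well as on a live cluster.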
'dr' state: qdel -f <jobid>
- force kills a job (useful for jobs stuck in the 'dr' state)
If the job still remains in the 'dr' state, have the administrator force kill the job.
'E' state: qmod -c <queue name>
- clears the error state ('E') for a queue
'au' state: Attempt a soft PXE reboot of the node. If that is not successful, the node will have to be hard rebooted.
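
The fixes above can be summarized as a small lookup. This is only a sketch: suggest_fix is a hypothetical helper that prints the command suggested on this page for a given state and does not run anything itself.

```shell
# Sketch: print the fix listed on this page for a given state.
# suggest_fix is a hypothetical helper; it only prints a suggestion.
#   $1 = state code as shown by qstat, $2 = job id or queue name
suggest_fix() {
    state=$1
    target=$2
    case "$state" in
        dr) echo "qdel -f $target" ;;
        E)  echo "qmod -c $target" ;;
        au) echo "reboot the node behind $target (soft PXE reboot first, then hard reboot)" ;;
        *)  echo "no fix listed for state '$state'"; return 1 ;;
    esac
}

# Example: suggest_fix dr <jobid>
```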
qmod [options] [job or queue]
'-d' disables the specified queue (no new jobs are dispatched to it; jobs already running are allowed to finish)
'-e' enables the specified queue (changes state from 'd')
More information on qmod here: https://linux.die.net/man/1/qmod
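
A common use of '-d' and '-e' is to take a queue out of scheduling while work is done on its node and put it back afterwards. The sketch below assumes that workflow; with_queue_disabled and the QMOD override variable are hypothetical names introduced for this example, and QMOD exists only so the helper can be pointed at a stub command for a dry run.

```shell
# Sketch: disable a queue, run a command, then re-enable the queue.
# with_queue_disabled and QMOD are hypothetical names for this example;
# QMOD defaults to the real qmod command.
with_queue_disabled() {
    queue=$1
    shift
    # Stop new jobs from being dispatched to the queue; give up if that fails.
    "${QMOD:-qmod}" -d "$queue" || return 1
    # Run the maintenance command and remember its exit status.
    "$@"
    status=$?
    # Re-enable the queue whether or not the command succeeded.
    "${QMOD:-qmod}" -e "$queue"
    return "$status"
}

# Example: with_queue_disabled <queue name> <command> [arguments]
```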