We no longer use condor on this system; we use wq instead.
First, a few commonly used commands:
condor_q : list jobs in the queue
condor_submit {job_script} : submit the job specified in {job_script} to the queue
condor_rm {job_id} : remove a job from the queue
condor_status : check machine status; you can pipe the output to "grep astro", for example, to shorten the list of machines
Standard Jobs
To submit a standard serial job, you create a condor "submission file" and submit the job to the queue.
For example, if your submission file is called submitJob.condor, you would type
condor_submit submitJob.condor
Here is a simple condor submission file for running a script called "test.sh" located in the
directory /astro/u/username/condor:
Universe = vanilla
Notification = Error
GetEnv = True
Notify_user = username@bnl.gov
+Experiment = "astro"
Requirements = (CPU_Experiment == "astro")
Initialdir = /astro/u/username/condor
Executable = test.sh
Output = test.out
Error = test.err
Log = test.log
Queue
Of course, replace username with your username and make Initialdir point to
one of your directories. Initialdir will be the default directory for files
if you don't explicitly give full paths, e.g. test.out will go there.
Anything can go into the test.sh script, but you must make sure it is executable:
chmod 755 test.sh
The standard output goes to test.out, standard error to test.err, and a log of condor
events goes to test.log.
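To illustrate the executable requirement, here is a minimal sketch that creates a trivial test.sh, sets its permissions, and checks the result before submission. The script body and the scratch directory (used here instead of your Initialdir) are assumptions for illustration only.

```shell
#!/bin/sh
# Work in a scratch directory for illustration; in practice use your Initialdir.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A trivial job script (placeholder contents, not a real workload).
cat > test.sh <<'EOF'
#!/bin/sh
echo "running on $(hostname)"
EOF

# Same command as in the text above.
chmod 755 test.sh

# Read back the permission string; 755 corresponds to -rwxr-xr-x.
perms=$(ls -l test.sh | cut -c1-10)
echo "test.sh permissions: $perms"
```

If the permission string does not show the x bits, the job script has not been made executable.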
You can make as many of these scripts and condor submit files as you want and run
them all on the cluster. For example, if you have a bunch of scripts called myscript-XX.sh,
where XX is a two-digit number from 00 to 99, you can have a corresponding set of condor submit
files called myscript-XX.condor. Then, to submit them all:
for i in $(seq -w 0 99); do condor_submit myscript-${i}.condor; done
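Rather than writing the 100 submit files by hand, they can be generated from a template. The following is a minimal sketch, assuming the job scripts are named myscript-XX.sh as above and reusing the settings from the example submit file. The scratch output directory and the per-job output file names are illustrative assumptions. The condor_submit call is left commented out so the loop can be tried safely.

```shell
#!/bin/sh
# Scratch directory for the generated submit files (assumption for illustration).
outdir=$(mktemp -d)

for i in $(seq -w 0 99); do
    # One submit file per job script; $i is the zero-padded job number.
    cat > "$outdir/myscript-$i.condor" <<EOF
Universe     = vanilla
Notification = Error
GetEnv       = True
+Experiment  = "astro"
Requirements = (CPU_Experiment == "astro")
Initialdir   = /astro/u/username/condor
Executable   = myscript-$i.sh
Output       = myscript-$i.out
Error        = myscript-$i.err
Log          = myscript-$i.log
Queue
EOF
    # Uncomment to actually submit each job:
    # condor_submit "$outdir/myscript-$i.condor"
done

# Count the generated files as a sanity check.
count=$(ls "$outdir" | wc -l | tr -d ' ')
echo "generated $count submit files in $outdir"
```

As before, replace username with your username and point Initialdir at the directory holding the myscript-XX.sh files.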
MPI Jobs
MPI jobs are special. See http://en.wikipedia.org/wiki/Message_Passing_Interface for a definition of MPI.
To run an MPI job, first start an mpd ring, then submit the job using condor_submit {job_script}. The
job script differs from the serial one: an intermediate script, which collects all the information needed
by the mpiexec command and sets up the necessary environment, is passed to the condor job script as
the Executable. The real executable for the job is passed as Arguments to the condor Executable
(the intermediate script). Luckily, Tom has made this easy for us by writing the intermediate script. It is
attached as mp2script.sh. The job_script is attached as submitMPI.condor. To submit the job, all you
need to do is edit the file submitMPI.condor and run
condor_submit submitMPI.condor
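The wrapper pattern described above can be sketched as follows. This is not Tom's mp2script.sh, whose contents are not reproduced here; it is a hypothetical illustration in which the real executable and its arguments arrive as the wrapper's arguments and are handed to mpiexec. The NP variable, its default, and the program name my_mpi_program are assumptions, and the mpiexec line is only printed, not run.

```shell
#!/bin/sh
# Hypothetical wrapper sketch; NOT the attached mp2script.sh.
# NP (number of MPI processes) is an assumed variable with an assumed default.
NP=${NP:-4}

build_mpi_command() {
    # "$@" is the real executable followed by its own arguments,
    # i.e. what the condor submit file would pass as Arguments.
    if [ "$#" -lt 1 ]; then
        echo "usage: wrapper real_executable [args...]" >&2
        return 1
    fi
    echo "mpiexec -n $NP $*"
}

# Print the command the wrapper would run for a hypothetical program.
cmd=$(build_mpi_command ./my_mpi_program input.dat)
echo "$cmd"
```

In the real setup, the submit file names the wrapper as Executable and the MPI program as Arguments, so condor starts the wrapper and the wrapper starts the MPI program.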
One more note: jobs used to be submitted from lsst01 (the old head node). lsst01 has since been retired
and astro0034 has taken its place. The migration has been finished, so now astro0034 is the master.