The purpose of the queuing system is (1) to promote the efficient utilization of our computer facilities, and (2) to promote equity of access to those resources across our user community. This page is designed to help users get started with this system.

h4. How it works for users

Users should edit their shell scripts to add special directives to the queue system, beginning with "#PBS", that request resources, declare a required walltime, and direct standard output and error. Users can then "submit" their job using commands like "qsub," as described in the [next section|#command_table]. Advanced details can be found in [Section 2.1|http://www.clusterresources.com/products/torque/docs/2.1jobsubmission.shtml] of the Torque user manual.
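
As a minimal sketch (the executable name and the resource values here are placeholders, not taken from the attached examples), a complete job script might look like:
{code}
#!/bin/bash
#PBS -N example_job
#PBS -q short
#PBS -l walltime=2:00:00
#PBS -l nodes=1:ppn=8
#PBS -j oe
#PBS -o example_job.log
# Torque starts jobs in the home directory; change to the submission directory
cd $PBS_O_WORKDIR
./my_simulation input.conf
{code}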

A simple example of such a script can be found in the attached *[Script Library^apoa.sh]*, which runs the [ApoA1 benchmark|http://ftp.ks.uiuc.edu/Research/namd/performance.html]. Other examples have been posted on the [Script Library|Script Library#queue] page, including an example documenting how to submit multiple similar jobs using Torque's job array functionality.

A job name is assigned with a "#PBS \-N" statement, and the destination queue is specified with a "#PBS \-q" statement:

{code}
#PBS -N solution_equilibration_273K
{code}
{code}
#PBS -q short
{code}

The standard output and standard error streams can be directed to files using the "#PBS \-o" and "#PBS \-e" directives. The two streams can be joined using a "#PBS \-j oe" directive.

{code}
#PBS -e solution_equil.err
#PBS -o solution_equil.log
{code}
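
A sketch of the joined-stream alternative mentioned above, where "\-j oe" merges standard error into the standard output file (the log file name is illustrative):
{code}
#PBS -j oe
#PBS -o solution_equil.log
{code}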

Users can request resources using a "#PBS \-l" statement. Resources include the walltime (in mm:ss or hh:mm:ss format) and the number of nodes and the number of processors per node. In the example below, several _alternative_ node requests are shown to illustrate the possible syntax; only one would be included in an actual script.

{code}
#PBS -l walltime=14:30:00
{code}
{code}
#PBS -l nodes=1:ppn=4:group0
     OR
#PBS -l nodes=1:ppn=8:group0
     OR
#PBS -l nodes=n024:ppn=8
     OR
#PBS -l nodes=1:ppn=8+nodes=1:ppn=4
     OR
#PBS -l nodes=n024:ppn=8+nodes=1:ppn=8
     OR
#PBS -l nodes=1:ppn=8:mem16
{code}

Multiple job submission through job arrays is available on Cyrus1, but not on Quantum2 (which runs an earlier version of Torque). This feature can be especially helpful to users who need to submit a large number of similar jobs. For more details, see section 2.1.1 of the [Torque user manual|http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml].
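
As a sketch of how such an array might look (the analysis program and file names are hypothetical), the "\-t" option gives each sub-job its own index in the PBS_ARRAYID environment variable:
{code}
#PBS -N array_example
#PBS -q short
#PBS -l walltime=1:00:00
#PBS -l nodes=1:ppn=1
#PBS -t 0-9
# each of the ten sub-jobs processes its own input file
cd $PBS_O_WORKDIR
./my_analysis input_${PBS_ARRAYID}.dat > output_${PBS_ARRAYID}.log
{code}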

Some or all of these arguments can also be given at the command line. Command-line settings override any settings in the script.
{code}
[bashprompt]$ qsub -q short -l walltime=5:00:00 -l nodes=2:ppn=8 -e test_stderr.txt  ./test_simulation.sh
{code}

Some notes and suggestions for users:

# Users should request the lowest possible walltime for their jobs, since the queue system must otherwise "block out" the entire 24-hour period when no walltime is specified. This is analogous to a customer arriving at a busy barbershop and explaining that he only needs a "very quick trim."
# If users do not specify a queue in their script, the "default" queue is used. This queue has a walltime limit of 24 hours, a node count limit of 1, and a default priority of 60; in other words, it is less desirable than any of the queues listed in the [tables below|#queue_tables]. Specifying a queue with the "#PBS \-q" option is a good idea.
# Short "analysis scripts" are not exempt from the queuing system. If your script runs anything other than Linux utilities (e.g. VMD, CHARMM, custom C code for MD analysis), it belongs in the queue system. Just add a few lines as discussed in the section above; a sketch of such a job appears after this list.
# Input and output to files do not seem to be as immediate as when running directly from the command line. Users should not count on immediate access to program output.
# Users should test system scaling before expanding beyond one node; for systems of 10 to 20 katoms, poor scaling has been observed beyond 8 ppn, while the 92-katom ApoA1 benchmark case scales well to 2 nodes (and possibly beyond).
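
Referring to item 3, a short analysis job might look like the following sketch (the queue choice, file names, and analysis binary are illustrative):
{code}
#PBS -N rdf_analysis
#PBS -q debug
#PBS -l walltime=15:00
#PBS -l nodes=1:ppn=1
#PBS -j oe
#PBS -o rdf_analysis.log
cd $PBS_O_WORKDIR
# hypothetical analysis program; a VMD, CHARMM, or custom binary would go here
./compute_rdf trajectory.dcd > rdf.dat
{code}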

{anchor:command_table}

h4. Managing jobs in the queuing system, and the approach used by Torque

The following commands can be used to submit and manage jobs:
|| command || purpose ||
| qsub _jobscript_ | Submit job in script _jobscript_. Can accept other arguments as discussed above. |
| qsub \-I \-l nodes=1:ppn=4 | Request interactive job with indicated resources. |
| qdel _jobID_ | Delete job number _jobID_. Seems to kill processes on compute nodes cleanly. |
| showq | List jobs by state, with scheduling details. |
| showbf | List "backfill," or the number of processors available at the moment. |
| qstat | List active jobs. Use "-i" for a different format. |
| qstat \-q | List queue attributes for all queues. |
| qstat \-f _jobID_ | List detailed information for job number _jobID_. |
| qnodes | List all nodes and their state and properties. |
| qnodes \-l down | List those nodes currently down. |
| qnodes \-l active | List nodes currently used for jobs. |
| qnodes \-l free | List nodes currently free. |
| qmgr \-c "print server" | Print queue configuration details. |
|| administrative command || purpose ||
| pbsnodes \-o _nodename_ | Take node offline after allowing current job to finish. |
| pbsnodes \-r _nodename_ | Check node and then return to free status if possible. |
| checkjob _jobid_ | Check status of job. |
| releasehold \-a _jobid_ | Release holds on job. |
| qrun _jobid_ | Force job to execute now. |
| showstats | Show usage statistics. |
| showgrid _statistic_name_ | Print a text table of _statistic_name_ by time. |
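
As an illustrative session (the jobID shown is a placeholder; qsub prints the actual ID at submission):
{code}
[bashprompt]$ qsub ./test_simulation.sh
[bashprompt]$ qstat -f 4321    # inspect the job, using the jobID printed by qsub
[bashprompt]$ qdel 4321        # remove the job if it was submitted in error
{code}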


The PBS queue system allocates a set of nodes and processors to an individual job, either for the walltime specified in the job or for the maximum walltime in the queue. It then provides a [set of environment variables|http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml] to the shell in which the script runs, such as PBS_NODEFILE, the temporary node file describing the allocated CPUs. When running with OpenMPI's mpiexec, the submitted script seems to launch processes correctly without the nodefile being passed as an argument to mpiexec, although it is not clear whether that behavior is [a feature or a bug|http://www.open-mpi.org/community/lists/users/2007/05/3179.php].
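
For MPI builds that do require the nodefile, a common pattern (sketched here; mpiexec flag names vary between MPI implementations) is to count the allocated CPUs and pass the file explicitly:
{code}
# count the CPUs that Torque allocated to this job
NP=$(wc -l < $PBS_NODEFILE)
# hand the nodefile to mpiexec explicitly; flag names vary by MPI implementation
mpiexec -np $NP -machinefile $PBS_NODEFILE ./my_parallel_program
{code}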

Additional information for queue administrators on [PBS|http://www-theor.ch.cam.ac.uk/IT/servers/maui/pbs.html] and [Maui|http://www-theor.ch.cam.ac.uk/IT/servers/maui/maui-admin.html] can be found on a helpful page maintained by the University of Cambridge's Theoretical Chemistry department.

{anchor:queue_tables}

h4. Table of queue attributes

The following tables were updated on June 11th to reflect changes made on that date. Note that a dash ("--") indicates no limit for the given queue.

*Queue attributes on Cyrus1*
|| || debug || short || long ||
| max walltime | 20 min | 24 hr | 5 days |
| max nodes per job | 1 | 5 | 1 |
| max nodes per queue | -- | -- | 10 |
| max jobs running per user | -- | -- | 5 |
| max jobs queuable per user | -- | -- | -- |
| priority | 100 | 80 | 60 |


*Queue attributes on Darius1, Darius2, Xerxes and Artaxerxes*
|| || debug || short || long || verylong ||
| max walltime | 20 min | 24 hr | 5 days | 14 days |
| max nodes per job | 1 | 32 | 4 | 4 |
| max nodes per queue | -- | -- | 15 | 5 |
| max jobs running per user | -- | -- | 15 | 5 |
| priority | 100 | 80 | 60 | 40 |