...
Users should edit their shell scripts to add special directives to the queue system, beginning with "#PBS", that request resources, declare a required walltime, and direct standard output and error. Users can then "submit" their job using commands like "qsub," as described in the next section.
A simple example of such a script can be found in the attached apoa.sh, which runs the ApoA1 benchmark.
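For orientation, a job script of this kind might look like the sketch below. This is not the attached apoa.sh. The job name, resource values, and output file names are illustrative assumptions and should be adapted to the actual job.

```shell
#!/bin/bash
# Hypothetical job script: job name, resources, and file names are assumptions.
#PBS -N example_job
#PBS -l nodes=1:ppn=4
#PBS -l walltime=01:00:00
#PBS -o example_job.out
#PBS -e example_job.err

# Move to the directory the job was submitted from.
# The fallback to $PWD only matters when the script is run outside the queue.
cd "${PBS_O_WORKDIR:-$PWD}"

# Count allocated processor slots (one line per slot in the node file).
# Outside the queue, PBS_NODEFILE is unset and a single slot is assumed.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    nprocs=$(wc -l < "$PBS_NODEFILE")
    nprocs=$((nprocs))
else
    nprocs=1
fi

echo "Running on ${nprocs} processor slot(s)"
# The program launch would go here, using ${nprocs} as the process count.
```

The "#PBS" lines are ordinary comments to the shell, so the script can also be run directly for a quick check before it is submitted.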
...
- Users should request the shortest walltime in which their job can realistically finish. When no walltime is specified, the queue system must "block out" the entire 24-hour period, which makes the job harder to fit into the schedule. This is analogous to a customer arriving at a busy barbershop and explaining that he only needs a "very quick trim."
- If users do not specify a queue in their script, the "default" queue is used. This queue has a walltime limit of 24 hours, a node count limit of 1, and a default priority of 60; in other words, it is less desirable than any of the queues listed in the tables below.
- File input and output from a queued job do not seem to be as immediate as when the program is run directly from the command line. Users should not count on immediate access to program output.
- Users should test how their system scales before expanding beyond one node. For systems of 10 katoms, poor scaling has been observed beyond 8 processors per node (ppn), while the 92-katom ApoA1 benchmark case scales well to 2 nodes.
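The scaling test suggested in the last item could be organized as a short series of submissions at increasing processor counts. The sketch below is a dry run that only prints the qsub commands. The script name, the job names, and the list of ppn values are assumptions, and it relies on command-line options to qsub taking precedence over the matching "#PBS" directives in the script.

```shell
#!/bin/bash
# Dry-run sketch of a scaling series; all names and values are assumptions.
jobscript="apoa.sh"      # placeholder: substitute the job script to be tested
submit="echo qsub"       # change to "qsub" to actually submit the jobs

submit_scaling_series() {
    local ppn
    for ppn in 1 2 4 8; do
        # One job per processor count, each on a single node.
        $submit -l "nodes=1:ppn=${ppn}" -N "scale_ppn${ppn}" "$jobscript"
    done
}

submit_scaling_series
```

Comparing the run times of the resulting jobs shows where adding processors stops paying off for a given system size.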
Working with the queuing system, and the approach used by Torque
The following commands can be used to submit and manage jobs:
command | purpose |
---|---|
qsub jobscript | Submit job in script jobscript. Can accept other arguments as discussed above. |
qsub -I -l nodes=1:ppn=4 | Request interactive job with indicated resources. |
qdel jobID | Delete job number jobID. Seems to kill processes on compute nodes cleanly. |
qstat | List active jobs |
qstat -f jobID | List detailed information for job number jobID. |
qnodes | List all nodes and their state and properties. |
qnodes -l down | List those nodes currently down. |
qnodes -l active | List nodes currently used for jobs. |
qnodes -l free | List nodes currently free. |
qmgr -c "print server" | Print queue configuration details |
The PBS queue system allocates a set of nodes and processors to an individual job, either for the walltime specified in the job or for the maximum walltime of the queue. It then provides a set of environment variables to the shell in which the script runs, such as PBS_NODEFILE, which holds the path to the temporary node file describing the allocated CPUs.
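As an illustration of how a script can use PBS_NODEFILE, the sketch below summarizes a node file that lists one host name per allocated processor slot. The function name summarize_nodefile is hypothetical and is not part of Torque.

```shell
#!/bin/bash
# Hypothetical helper: report slots and distinct nodes listed in a node file.
summarize_nodefile() {
    local nodefile="$1"
    local slots nodes
    slots=$(wc -l < "$nodefile")            # one line per processor slot
    nodes=$(sort -u "$nodefile" | wc -l)    # distinct host names
    echo "$((slots)) slots on $((nodes)) node(s)"
}

# Inside a job the variable is set by the queue system; elsewhere it is not.
if [ -n "${PBS_NODEFILE:-}" ]; then
    summarize_nodefile "$PBS_NODEFILE"
fi
```

The slot count is the value typically passed to a parallel launcher as the number of processes to start.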
Table of Queues
The following tables are available in printer-friendly form in an attached file. Note that the settings can be adjusted to meet users' needs as those needs become clear.
...