PBS Scheduler basics

The Zenobe supercomputer uses the Portable Batch System (PBS) from Altair for job submission, monitoring, and management. The current release is 13.1.2.

Batch and interactive jobs are available. Interactive jobs are particularly useful for developing and debugging applications.

Basic Commands

The most commonly used PBS commands (qsub, qdel, qhold, qalter, and qstat) are briefly described below. These commands are run from the login node to manage your batch jobs.
See the PBS Reference Manual for a list of all PBS commands.

qsub

The qsub command submits a job to the batch system.

  • To submit a batch job script:
    qsub [options] my_script
  • To submit a batch job executable:
    qsub [options] my_executable [my_executable_arguments]
  • To submit an interactive job:
    qsub -I [options]
  • To submit a job array:
    qsub -J <num-range> [options] script or executable
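
As a minimal sketch of the submission forms above, the following writes a trivial job script and submits it once as a batch job and once as a job array. The file name my_script.sh and the array range 1-4 are illustrative assumptions. The qsub calls are guarded so that the snippet does nothing harmful on a machine without PBS.

```shell
#!/bin/sh
# Create a trivial job script (my_script.sh is a hypothetical name).
cat > my_script.sh <<'EOF'
#!/bin/sh
echo "Hello from $(hostname)"
EOF

# Submit only where PBS is installed, for example on the login node.
if command -v qsub >/dev/null 2>&1; then
    # Batch job: qsub prints the identifier of the new job.
    qsub my_script.sh
    # Job array with four sub-jobs, indexed 1 to 4.
    qsub -J 1-4 my_script.sh
else
    echo "qsub not found; nothing submitted"
fi
```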

Most common options:

Input/output

  • -o path
    standard output file
  • -e path
    standard error file
  • -j oe (eo)
    joins standard error to standard output (standard output to standard error); oe is the default

Queue

  • -q <queue_name>
    runs the job in queue <queue_name>

Notification

  • -M <email_address>
    notifications are sent to this email address
  • -m b|e|a|n
    sends notifications on the following events: begin, end, abort, no mail (default). Do not forget to specify an email address (with -M) if you want to receive these notifications.

Resource

  • -l walltime=[hours:minutes:]seconds
    requests wall-clock time; the default is 12 hours
  • -l select=N:ncpus=NCPU
    requests N chunks of NCPU slots (CPU cores) each, that is N times NCPU cores for the job (default for NCPU: 1)
  • -l select=N:mem=size
    requests N chunks with size bytes of memory each (default is 1GB)
  • -l pmem=size
    requests a maximum of size bytes of memory for all processes of the job
  • -l model=<model_type>
    requests fit, fat, or xfat Haswell nodes when allowed in the queue
  • -l place=
    chooses the sharing, grouping, and placement of nodes when allowed in the queue (default is free)

Dependency

  • -W depend=afterok:job-id
    starts the job only if the job with job ID job-id has finished successfully

Miscellaneous

  • -r y|n
    declares whether the job is rerunnable (default: no)
  • -v
    specifies the environment variables and shell functions to be exported to the job
  • -V
    declares that all environment variables and shell functions in the user's login environment where qsub is run are to be exported to the job

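
The options above can also be placed inside the job script as directive lines beginning with #PBS, which qsub reads as if they had been given on the command line. The following is a minimal sketch, not a site-approved template: the queue my_queue, the email address, my_executable, and the file names job.pbs and postprocess.pbs are all placeholders. It also shows how a second job can be chained to the first with -W depend=afterok.

```shell
#!/bin/sh
# Write a job script whose "#PBS" lines carry the qsub options.
# my_queue, the email address and my_executable are placeholders.
cat > job.pbs <<'EOF'
#!/bin/sh
# Queue, wall-clock limit (1 h 30 min), two chunks of 4 cores and 2 GB each
#PBS -q my_queue
#PBS -l walltime=01:30:00
#PBS -l select=2:ncpus=4:mem=2gb
# Merge standard error into standard output and write both to one file
#PBS -j oe
#PBS -o job_output.log
# Send mail on abort and on end
#PBS -M user@example.org
#PBS -m ae

cd "$PBS_O_WORKDIR"
./my_executable
EOF

# A follow-up job that should run only after the first one succeeds.
cat > postprocess.pbs <<'EOF'
#!/bin/sh
#PBS -l walltime=00:10:00
echo "post-processing"
EOF

# Chain the two jobs where PBS is available.
if command -v qsub >/dev/null 2>&1; then
    first_id=$(qsub job.pbs)
    qsub -W depend=afterok:"$first_id" postprocess.pbs
fi
```

Options given on the qsub command line are combined with the directives in the script, so the script can hold the settings that rarely change.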
qdel

To delete a job:
qdel <jobid>

qhold

To hold a job:
qhold <jobid>

Only the job owner or a system administrator can place a hold on a job. The hold can be released using the qrls <jobid> command.

qalter

The qalter command modifies the attributes of one or more queued (not running) PBS jobs. The options you can modify are a subset of the directives that can be used when submitting a job. A non-privileged user may only lower resource limits.
qalter [options] <jobid>
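
The three commands above are often combined when adjusting a job that is still waiting in the queue. The helper below is a sketch under assumptions: the function name manage_job and the new walltime of 30 minutes are hypothetical, and the new limit is assumed to be lower than the one originally requested.

```shell
#!/bin/sh
# manage_job JOBID: hold a queued job, lower its walltime, then release it.
manage_job() {
    jobid=$1
    # Put the job on hold so it cannot start while being modified.
    qhold "$jobid" || return 1
    # Lower the wall-clock limit to 30 minutes (illustrative value).
    qalter -l walltime=00:30:00 "$jobid" || return 1
    # Release the hold so the job becomes eligible to run again.
    qrls "$jobid"
    # To remove the job instead of releasing it: qdel "$jobid"
}
```

It would be called with a job identifier as printed by qsub, for example manage_job <jobid>.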

qstat

To display queue information:
qstat -Q

Common options to display job information:

  • -a Display all jobs in any status (running, queued, held)
  • -r Display running jobs
  • -u <username> Display the jobs of user <username>
  • -f <jobid> Display detailed information about a specific job
  • -xf <jobid> Display detailed information about a specific finished job (within the past 48 hours)
  • -T Display estimated start time
  • -w Display information in a wide format
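
A short sketch of how these options might be wrapped for everyday use. The function names my_jobs and job_info are hypothetical, and the fallback in job_info assumes that qstat -f exits with a non-zero status for a job that has already left the queue.

```shell
#!/bin/sh
# my_jobs: list the jobs of the current user.
my_jobs() {
    qstat -u "${USER:-$(id -un)}"
}

# job_info JOBID: detailed information about a job; if the job is no
# longer queued or running, fall back to the finished-job history.
job_info() {
    qstat -f "$1" 2>/dev/null || qstat -xf "$1"
}
```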

PBS Environment Variables

Several environment variables are provided to PBS jobs. All PBS-provided environment variable names start with the characters "PBS_". Some start with "PBS_O_", which indicates that the variable is taken from the job's originating environment (that is, the user's environment).

A few useful PBS environment variables are described in the following list:

PBS_O_WORKDIR Contains the name of the directory from which the user submitted the PBS job
PBS_O_QUEUE Contains the name of the queue to which the job was originally submitted
PBS_JOBID Contains the PBS job identifier
PBS_NODEFILE Contains the path of the file that lists the nodes assigned to the job
PBS_JOBNAME Contains the job name
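
The fragment below sketches how a job script might use these variables. The fallback values (such as no-id and interactive) and the helper name count_nodes are assumptions added so that the snippet also runs outside of PBS, where the variables are unset.

```shell
#!/bin/sh
# Move to the submission directory; outside PBS, stay where we are.
workdir=${PBS_O_WORKDIR:-$PWD}
cd "$workdir" || exit 1

# Report which job this is and the queue it was submitted to.
echo "Job ${PBS_JOBNAME:-interactive} (${PBS_JOBID:-no-id}), queue ${PBS_O_QUEUE:-none}"

# count_nodes: number of distinct hosts listed in the node file.
# Prints 1 when no node file is available (for example outside PBS).
count_nodes() {
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        sort -u "$PBS_NODEFILE" | wc -l | tr -d ' '
    else
        echo 1
    fi
}

echo "Distinct nodes: $(count_nodes)"
```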