Accounting


Jobs Accounting Metric

Job accounting is based on the R_Wall_Time metric, which accounts for the resources actually mobilized by each job.

The metric definition is: R_Wall_Time = Wall_Time * ncpus_equiv_pjob

where:

  • Wall_Time is the execution time of the job (end_time - start_time - suspend_time);
  • and ncpus_equiv_pjob is defined as follows:
    • if the job is run in an exclusive queue or environment, whole nodes are allocated to your job, and
      ncpus_equiv_pjob = nodes_pjob * ncpus_pnode
    • otherwise the job runs in shared mode (sharing nodes with other jobs or projects), and
      ncpus_equiv_pjob = max ( ncpus_pjob , mem_pjob / mem_pcpu )

in which:

  • ncpus_pnode is the number of cores available per node
  • ncpus_pjob and mem_pjob are the total number of cores and the total amount of memory requested by the job (resource_list.ncpus and resource_list.mem, respectively, as reported by qstat -f for the job) and
  • mem_pcpu is the amount of memory available per core depending on the type of node requested, i.e.
    • 2625MB by default (Haswell fit nodes)
    • 5250MB if you requested Haswell fat nodes ( -l model=haswell_fat )
    • 10500MB if you requested Haswell xfat nodes ( -l model=haswell_xfat )

ncpus_equiv_pjob is the resource_used.rncpus value reported by qstat -f for the job.
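
The definitions above can be sketched in Python as follows. This is only an illustration, not the accounting code used on the cluster: the function names, the keyword arguments and the "default" key are hypothetical, times are assumed to be in hours and memory in MB, and no rounding is applied to mem_pjob / mem_pcpu because the text does not say whether the scheduler rounds it.

```python
# Memory available per core (MB), taken from the node types listed above.
# The "default" key is a hypothetical label for the default Haswell fit nodes.
MEM_PCPU_MB = {"default": 2625, "haswell_fat": 5250, "haswell_xfat": 10500}


def ncpus_equiv_pjob(ncpus_pjob, mem_pjob_mb, model="default",
                     exclusive=False, nodes_pjob=0, ncpus_pnode=0):
    """Equivalent number of cores charged to a job (hypothetical helper)."""
    if exclusive:
        # Exclusive mode: whole nodes are charged.
        return nodes_pjob * ncpus_pnode
    # Shared mode: the larger of the core request and the memory request
    # expressed as a number of cores.
    return max(ncpus_pjob, mem_pjob_mb / MEM_PCPU_MB[model])


def r_wall_time(start_time, end_time, suspend_time, ncpus_equiv):
    """R_Wall_Time = Wall_Time * ncpus_equiv_pjob, with times in hours."""
    wall_time = end_time - start_time - suspend_time
    return wall_time * ncpus_equiv
```

For instance, a shared job requesting 4 cores and 21000 MB on default nodes is charged max(4, 21000 / 2625) = 8 equivalent cores, so a 2-hour run gives r_wall_time(0.0, 2.0, 0.0, 8.0) = 16 hours of R_Wall_Time. The same request on haswell_fat nodes is charged only 4 equivalent cores.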

See also Job Output file to check the resource usage of a job.

Reservation or Dedicated Nodes Accounting

The R_Wall_Time accounted for a reservation or for dedicated nodes is:
(Reservation end_time - Reservation start_time) * Number of Dedicated Nodes * ncpus_pnode.
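
As a minimal sketch of this formula (the function name is hypothetical, times are assumed to be in hours, and the 24 cores per node used in the example is an assumed value, not one stated here):

```python
def reservation_r_wall_time(start_time, end_time, dedicated_nodes, ncpus_pnode):
    """R_Wall_Time charged for a reservation or dedicated nodes (hours)."""
    # The whole reservation window is charged on every core of every
    # dedicated node, whether or not jobs actually ran in it.
    return (end_time - start_time) * dedicated_nodes * ncpus_pnode


# Example: a 12-hour reservation of 4 nodes, assuming 24 cores per node.
charged = reservation_r_wall_time(0.0, 12.0, 4, 24)
print(charged)
```

With these assumed values the reservation is charged 12 * 4 * 24 = 1152 hours.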

Reporting

Reports are sent weekly and monthly to project managers. They provide the persistent and scratch storage usage, some job statistics, the R_Wall_Time credit used during the period (per queue and/or globally) and the remaining credit.
Some ratios are also provided:

  • η = Total CPU_Time / ( Total Wall_Time * ncpus_pjob )
  • α = Total CPU_Time / ( Total Wall_Time * ncpus_equiv_pjob )
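
The two ratios can be sketched in Python as follows. This is an illustration under stated assumptions: the function name and the dictionary keys are hypothetical, the job records are made up, and the totals in the denominators are assumed to be sums over jobs of Wall_Time multiplied by the per-job core count.

```python
def usage_ratios(jobs):
    """Return (eta, alpha) for a list of job records.

    Each record is a dict with hypothetical keys: cpu_time and wall_time
    (hours), ncpus_pjob and ncpus_equiv_pjob.
    """
    total_cpu = sum(j["cpu_time"] for j in jobs)
    # Core-hours requested (denominator of eta).
    requested = sum(j["wall_time"] * j["ncpus_pjob"] for j in jobs)
    # Core-hours charged, i.e. R_Wall_Time (denominator of alpha).
    charged = sum(j["wall_time"] * j["ncpus_equiv_pjob"] for j in jobs)
    return total_cpu / requested, total_cpu / charged


# Made-up jobs: the second one is charged 4 equivalent cores for 2 requested,
# for example because of a large memory request.
jobs = [
    {"cpu_time": 6.0, "wall_time": 1.0, "ncpus_pjob": 8, "ncpus_equiv_pjob": 8},
    {"cpu_time": 2.0, "wall_time": 2.0, "ncpus_pjob": 2, "ncpus_equiv_pjob": 4},
]
eta, alpha = usage_ratios(jobs)
print(f"eta = {eta:.1%}, alpha = {alpha:.1%}")
```

Here η is 8 / 12 and α is 8 / 16: α falls below η as soon as a job is charged more equivalent cores than it requested.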

Example:

======================================================================================
Project created Friday Oct 31 2014
======================================================================================
SCRATCH storage quota    : 400 GiB
SCRATCH storage used     : 196 GiB
Persistent storage quota : 200 GiB
Persistent storage used  : 48 GiB

-------------------------------------------------------------------------------------
All Standard Queues Usage
-------------------------------------------------------------------------------------
R_Wall_Time Credit used this month : 10.6 hours

Job Usage Detail (times in hours)
             # of       Total     Total  Total                  Total         Average
Username     jobs    CPU_Time Wall_Time N_Wall_Time     η R_Wall_Time    α  Wait_Time
----------- ----- ----------- --------- ----------- ----- ----------- ----- ---------
TOTAL           3         8.7       1.0        10.6 82.1%        10.6 82.1%       0.0

user1           3         8.7       1.0        10.6 82.1%        10.6 82.1%       0.0

Job Set Summary (times in hours)
                        Percentil  Percentil  Percentil
               Minimum       P_50       P_75       P_95    Maximum
            ---------- ---------- ---------- ---------- ----------
Ncpus                1          2          2          2         32
CPU_time           0.0        0.0        0.0        0.0        8.7
Wall_time          0.0        0.3        0.3        0.3        0.7
Wait_time          0.0        0.0        0.0        0.0        0.0
-------------------------------------------------------------------------------------

R_Wall_Time Credit allocated            : 31000 hours
R_Wall_Time Credit Valid until          : Thursday Apr 30 2016
R_Wall_Time Credit used previous report : 20128.2 hours
R_Wall_Time Credit used this month      : 10.6 hours
R_Wall_Time Credit used current report  : 20138.8 hours
Remaining R_Wall_Time Credit            : 10861 hours
Percentage R_Wall_Time Credit left      : 35 %
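
The credit figures at the end of the sample report can be checked with a few lines of Python. This is a sketch only: the variable names are hypothetical, and the rounding to whole hours and whole percent is an assumption about how the report is formatted.

```python
allocated = 31000.0       # R_Wall_Time credit allocated (hours)
used_previous = 20128.2   # credit used as of the previous report (hours)
used_this_month = 10.6    # credit used this month (hours)

used_current = used_previous + used_this_month
remaining = allocated - used_current
percent_left = 100 * remaining / allocated

print(f"Credit used current report : {used_current:.1f} hours")
print(f"Remaining credit           : {remaining:.0f} hours")
print(f"Percentage credit left     : {percent_left:.0f} %")
```

The computed values agree with the report above: 20138.8 hours used, about 10861 hours remaining, and about 35 % of the credit left.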


Credit management

Project managers are kindly requested to monitor their project's resource usage. Projects with a negative remaining credit have one month to update their request; after that period, jobs associated with these projects will be rejected. Academic users can update their project credit at any time via the following page: https://login.ceci-hpc.be/init-project/. Other users can send an email to it [at] cenaero [dot] be.