Compute grid

We have a small cluster on which you can launch tasks. This grid uses Slurm, the same system used at the Digital Research Alliance of Canada (formerly Compute Canada). Our resources are much more limited and the system configuration is different, but the usage is very similar and most of the information on the Alliance's job submission wiki page applies: https://docs.computecanada.ca/wiki/Running_jobs/en

The compute grid is accessible to all CIRRELT & GERAD members.

Restricted Access

None at this time.

Billing

The system bills the resources consumed. This billing does not incur any charges to users; it is used to track what each user has consumed and to calculate task priority.

So, for example, if you request 8 processors, they are reserved for you; whether you use all 8 or only 1, they are counted as used and billed accordingly.

Server Nodes

Here is a brief description of the machines that are available in the compute grid.

optimum

  • 11 Dell PowerEdge R740 machines
  • 512 GB of memory per machine
  • 2 Intel Xeon Gold 6258R CPUs @ 2.70 GHz (56 cores per machine)

These machines are split into 2 partitions:

  • optimum: tasks with a maximum duration of 48 hours. 10 machines;
  • optimumlong: tasks with a maximum duration of 7 days. 1 machine.

slurm

The slurm machine has its own partition called testing. It should be used only to verify that your scripts work correctly before submitting jobs to one of the other partitions. The available resources on this machine are very limited, since the goal is not to actually run jobs but to debug scripts.

Task submission

For all slurm commands (squeue, sbatch, etc.), you must connect by ssh to the slurm machine.
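
For example (yourusername is a placeholder; this assumes the frontend is reachable under the host name slurm):

ssh yourusername@slurm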

sinfo

This command shows the state of the compute grid and the maximum duration allowed in each partition. You can see, for example, whether machines are down, free, or suspended from the system for maintenance.

PARTITION    AVAIL  TIMELIMIT   NODES  STATE  NODELIST
optimum*     up     2-00:00:00      1  mix    optimum01
optimum*     up     2-00:00:00      9  idle   optimum[02-10]
optimumlong  up     7-00:00:00      1  idle   optimum11

In this example, we see that all the machines are up and only one of them is running tasks.

  • idle: machines are fully available
  • mix: machines are partially in use
  • drain: machines are under maintenance
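
To see why a machine is drained or down, sinfo can also print the reason recorded by the administrators:

sinfo -R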

sbatch

Command to submit a job to the cluster. To submit a job to the queue, you need a batch script that specifies the requirements of the job. The parameters can go in the submission script, on the command line, or in a combination of the two.

Here are the most commonly used parameters:

  • --cpus-per-task: number of processors that will be used by the program. It is important that this number matches what the program actually does. Some software, like cplex and gurobi, tries to use all processors unless it is restricted; if you request a single CPU and leave cplex unrestricted, it will run 56 parallel threads that all compete for one processor, which is not the right way to do it (see the sketch after this list).
  • --mem: the memory the program needs. If this memory is exceeded, the task is cancelled.
  • --time: the time required to complete the task. If this limit is exceeded, the task is cancelled.
  • --output: to specify the name of the file that will contain the output of the program. By default, the file is named after the job number (slurm-<jobid>.out).
  • --partition: partition to use. By default, this will be optimum, which allows tasks of up to 2 days. The optimumlong partition allows tasks of up to 7 days, but there is only one machine in that group. testing is for debugging your scripts and making sure they work before using the other partitions.
  • --nodelist=: use this option if you need to run on specific compute nodes. Multiple nodes can be separated with commas or given as a range such as optimum[01-03].
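
As mentioned in the --cpus-per-task item above, solvers should be capped at the number of processors you requested. A minimal sketch with gurobi's command-line tool, assuming a hypothetical model file model.lp; Slurm exports your allocation in SLURM_CPUS_PER_TASK:

# model.lp is a placeholder; Threads caps gurobi at the Slurm allocation
gurobi_cl Threads=$SLURM_CPUS_PER_TASK model.lp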

The settings you request determine what resources must be free before the task can start. Although there is no limit as such, the more you ask for, the longer you will wait, because it will be more difficult to obtain these resources.

Once your script is ready, you can use this command to submit it: sbatch instance.batch
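
Options given on the command line override the matching #SBATCH lines in the script, which is handy for one-off changes. For example:

# resubmit the same script with a different time limit and memory
sbatch --time=2:00:00 --mem=8G instance.batch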

Here are two example scripts you can use as a starting point.

This script can be used to submit 1 job to the system.

#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem=50G
#SBATCH --time=1:00:00
#SBATCH --output=/dev/null
#SBATCH --partition=optimum

# load anaconda and activate the base environment, then run the
# program under /usr/bin/time -v to record its resource usage
module load anaconda
conda activate
/usr/bin/time -v python memoire.py

sleep 60

If you have more than one job to submit, you should use the array option. This way, you submit a single job, but the entries are still executed as separate tasks.

In this example, the instances are defined by three parameters:

  • n = 2 values
  • dataset = 2 values
  • country = 2 values.

2 * 2 * 2 = 8 combinations.

This is why we use --array=1-8 in the header. It is really important that the numbers match if you want all jobs to run.

#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem=10G
#SBATCH --time=10:00
#SBATCH --array=1-8
#SBATCH --output=arrayjob_%A_%a.out

# each task must be able to determine which combination it
# handles based on its index in the slurm array.
# to do so, we enumerate every possible combination,
# and when the index matches the task id, we run
# the program with the corresponding arguments.

i=1
for n in 1 2
do
    for dataset in ab bd
    do
        for country in canada france
        do
            if [ $SLURM_ARRAY_TASK_ID -eq $i ]
            then
                ./array.pl $n $dataset $country
            fi
            (( i = i + 1 ))
        done
    done
done

sleep 60
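
Enumerating every combination with nested loops works, but the mapping can also be computed directly. A sketch, not part of the original script, that derives the same arguments arithmetically from the array index (the value lists and array.pl are taken from the example above):

# map SLURM_ARRAY_TASK_ID (1-8) onto the 2 x 2 x 2 combinations
idx=$(( SLURM_ARRAY_TASK_ID - 1 ))
n_values=(1 2)
datasets=(ab bd)
countries=(canada france)
n=${n_values[$(( idx / 4 ))]}               # changes every 4 tasks
dataset=${datasets[$(( (idx / 2) % 2 ))]}   # changes every 2 tasks
country=${countries[$(( idx % 2 ))]}        # changes every task
./array.pl $n $dataset $country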

To run a matlab program on the compute grid, your program must not use graphical mode, and it must be able to obtain all the parameters it needs without interaction and without having to modify the code.

#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem=16G
#SBATCH --time=10:00
#SBATCH --output=hello.out

# -batch runs hello.m non-interactively and exits when it finishes
module load matlab
matlab -batch hello

sleep 60

scancel

Program to cancel one or more tasks. Examples:

scancel 1234        # cancel job 1234
scancel -u user     # cancel all of user's jobs


squeue

This command is used to see the jobs that are in the system. By default, if you don't give any options, you will see everyone's tasks. You can use the `-u` option to show only a specific user's jobs. You can use the `SQUEUE_FORMAT` environment variable to change the default output of the command. For example, at the Alliance they use:

export SQUEUE_FORMAT="%.15i%.8u%.12a%.14j%.3t%.10L%.5D%.4C%.10b%.7m%N (%r)"

Example commands:

squeue -u username             displays jobs for the specified user
squeue -u username -t RUNNING  displays currently running jobs
squeue -u username -t PENDING  displays jobs that are still in the queue

squeue job states:

CA  Canceled   Job was explicitly canceled by the user or system administrator
CD  Completed  Job completed
F   Failed     Job terminated with a non-zero exit code
PD  Pending    Job awaiting resource allocation
R   Running    Job currently executing

sacct

This command can be used to get additional information on a specific task. The available fields can be listed with the -e option and then selected with the --format option.

You can also see the status of completed jobs if they are still in the database. To see older jobs, you can specify a start date.

sacct --starttime 2022-04-17 --format=Account,User,JobID,Start,End,AllocCPUS,Elapsed,AllocTRES%30,CPUTime,AveRSS,MaxRSS,MaxRSSTask,MaxRSSNode,NodeList,ExitCode,State%20

If you don't want to specify the format every time, you can use the SACCT_FORMAT environment variable. The following format is used at the Alliance:

export SACCT_FORMAT=Account,User,JobID,Start,End,AllocCPUS,Elapsed,AllocTRES%30,CPUTime,AveRSS,MaxRSS,MaxRSSTask,MaxRSSNode,NodeList,ExitCode,State%20

sstat

Similar to sacct, but it works only for currently running jobs and only on your own jobs.
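
For example, to check the memory footprint of a running job (1234 is a placeholder job id), using standard sstat fields:

sstat -j 1234 --format=JobID,AveCPU,AveRSS,MaxRSS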

/scratch GlusterFS

There is a temporary /scratch directory available on the optimum machines as well as on the slurm frontend. You can use this space to copy your datasets before submitting your jobs.
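
For example, a sketch assuming a hypothetical dataset directory ~/datasets; a per-user subdirectory keeps the space organized:

# placeholder paths: stage the data on /scratch before submitting
mkdir -p /scratch/$USER
cp -r ~/datasets /scratch/$USER/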

This space has no backup, so don't use it to store files you can't afford to lose.

Please use this space reasonably.