Compute grid

CIRRELT and GERAD provide a local computing cluster based on the Slurm workload manager, the same system used by the Digital Research Alliance of Canada (formerly Compute Canada). Although our resources are more limited and the configuration differs, usage remains similar, so most of the information on the Alliance wiki's job submission page also applies to our cluster.

Rules for Using the Computing Cluster

  • Never run jobs directly on the slurm machine (it is reserved for job submissions).
  • Use the correct amount of memory for your jobs.
  • Use the appropriate number of CPUs for your jobs.
  • Jobs shorter than 10 minutes may be ignored by the system. Make sure your computations exceed this threshold.
  • Use the appropriate partition for your job:
      • optimum: For computations under 2 days (10 machines available).
      • optimumlong: For computations between 2 and 7 days (1 machine available).
      • testing: To validate your scripts before submission (max. 15 minutes).
  • Use your disk space under /scratch if your job performs a large amount of read/write operations or if you have large datasets.

Hardware Resources

Model                 Memory   CPU (per machine)
Dell PowerEdge R740   512 GB   2 × Intel Xeon Gold 6258R (56 cores)

Note: The slurm machine may be used to test your scripts (using the testing partition), but its resources are very limited.

Job Management

To run any Slurm command (squeue, sbatch, etc.), you must connect via SSH to the slurm machine.
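
For example, assuming your account is username and the frontend answers to the hostname slurm (adapt both to your own setup):

ssh username@slurm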

Cluster Status

The sinfo command shows the status of the computing cluster and the maximum time limits of the partitions.

PARTITION      AVAIL    TIMELIMIT   NODES  STATE  NODELIST
optimum*       up       2-00:00:00      1   mix   optimum01
optimum*       up       2-00:00:00      1   drain optimum02
optimum*       up       2-00:00:00      1   drng  optimum03
optimum*       up       2-00:00:00      7   idle  optimum[04-10]
optimumlong    up       7-00:00:00      1   idle  optimum11
  • idle: machine is completely available
  • mix: partially used
  • drain: under maintenance
  • drng: running tasks but not accepting new ones; will switch to drain when tasks finish

Task Submission

The sbatch command is used to submit a task to the compute grid. Parameters can be set in the submission script, passed on the command line, or a combination of both.
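
For example, a value passed on the command line takes precedence over the corresponding #SBATCH line in the script (instance.batch is the submission script used further below):

sbatch --time=30:00 --mem=4G instance.batch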

To run a task on the grid, you must have a script with Slurm parameters.

Here are the most commonly used parameters:

Parameter         Description                                                              Example
--cpus-per-task   Number of CPUs allocated. Must match the actual needs of your program.   --cpus-per-task=4
--mem             Required memory. If exceeded, the task is canceled.                      --mem=16G
--time            Maximum duration. Format: D-HH:MM:SS.                                    --time=2-12:00:00
--output          Output file. Default: slurm-<ID>.out.                                    --output=results.log
--partition       Target partition (optimum, optimumlong, testing).                        --partition=optimumlong
--nodelist        Specific nodes (e.g., optimum[01-03]).                                   --nodelist=optimum01
--array           Runs identical tasks in parallel, each with a unique ID                  --array=1-8
                  ($SLURM_ARRAY_TASK_ID), avoiding manual submission of each instance.

Best Practices

  • Test your script with the testing partition before submitting to optimum.
  • Avoid overestimating resources: Higher demands result in longer wait times.
  • Limit CPU usage for software like CPLEX or Gurobi (use --cpus-per-task) and adjust the number of threads in your program accordingly; see the sketch after this list.
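
A minimal sketch of that last point, assuming the program accepts a hypothetical --threads argument (adapt it to however your code sets the solver's Threads parameter):

#!/bin/bash
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=1:00:00
#SBATCH --partition=optimum

source ~/.bashrc
module load python
source .venv/bin/activate

# Slurm exports SLURM_CPUS_PER_TASK inside the job; passing it to the program
# keeps the solver from starting more threads than the CPUs reserved above.
# (--threads is a hypothetical flag: adapt it to your own program.)
python memoire.py --threads "$SLURM_CPUS_PER_TASK"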

When your script is ready, simply use the command:

sbatch instance.batch
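
If the submission is accepted, sbatch replies with the ID assigned to the task, for example:

Submitted batch job 12345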

Here are three script examples to get you started:

Script for a single instance to solve.

#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem=10G
#SBATCH --time=1:00:00
#SBATCH --output=/dev/null
#SBATCH --partition=optimum

source ~/.bashrc

module load python
source .venv/bin/activate
/usr/bin/time -v python memoire.py

sleep 60

If you have a series of instances, use the array option to submit a single task with sbatch, grouping all the instances you want to solve.

In this example, the total number of generated instances is 8:

  • n: 2 values
  • dataset: 2 values
  • country: 2 values

2 × 2 × 2 = 8, which is why we use --array=1-8 in the header. It is crucial that these numbers match so that all instances are launched.

#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem=10G
#SBATCH --time=10:00
#SBATCH --array=1-8
#SBATCH --output=arrayjob_%A_%a.out

source ~/.bashrc

# Define parameters in arrays
n_values=(1 2)
datasets=("ab" "bd")
countries=("canada" "france")

# These loops generate all possible combinations
# based on the specified parameters.
# However, only the task corresponding to
# the index $SLURM_ARRAY_TASK_ID provided by Slurm should be executed.
# This is why we use the index $i for comparison.

# Start at 1 so that the indices match --array=1-8 in the header.
i=1
for n in "${n_values[@]}"; do
    for dataset in "${datasets[@]}"; do
        for country in "${countries[@]}"; do
            # If this test or equivalent is not present, all
            # tasks will be executed for each array task.
            if [ "$SLURM_ARRAY_TASK_ID" -eq "$i" ]; then
                ./array.exe "$n" "$dataset" "$country"
                exit 0
            fi
            ((i++))
        done
    done
done

sleep 60
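
If a few indices fail, the same script can be resubmitted for those indices only by overriding --array on the command line (array.batch is a hypothetical file name for the script above):

sbatch --array=3,5 array.batch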

To run a MATLAB program on the compute grid, ensure your program is not in graphical mode and can obtain all required parameters without interaction or code modification.

#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem=16G
#SBATCH --time=10:00
#SBATCH --output=hello.out
source ~/.bashrc
module load matlab
matlab -batch hello
sleep 60
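
If your program needs parameters, -batch also accepts a MATLAB statement rather than just a script name; for example, assuming hello were rewritten as a function taking two arguments (a hypothetical signature):

matlab -batch "hello(3, 'canada')"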

Task Cancellation

The scancel program is used to cancel one or more tasks. Examples:

scancel 1234          # cancel the task with ID 1234
scancel -u username   # cancel all of the user's tasks

Task Monitoring and Statistics

Current Task Status

The squeue program is used to view tasks in the system. By default, if no options are provided, it shows everyone's tasks; use the -u option to restrict the list to your own username. The SQUEUE_FORMAT environment variable can be used to change the command's default display.

For example, the Alliance uses:

export SQUEUE_FORMAT="%.15i %.8u %.12a %.14j %.3t %.10L %.5D %.4C %.10b %.7m %N (%r)"

Examples

Command                          Description
squeue -u username               Displays all tasks for the user
squeue -u username -t RUNNING    Displays the user's tasks that are running
squeue -u username -t PENDING    Displays the user's tasks that are pending

Task States

Code   State       Description
CA     Canceled    The task was canceled
CD     Completed   The task completed
F      Failed      The task exited with a non-zero exit code
PD     Pending     The task is waiting to be scheduled
R      Running     The task is running

Detailed Task Information

The sstat and sacct commands can be used to obtain more information about tasks. The sstat command can only be used for running tasks, while sacct can also be used for completed tasks.

The available fields can be displayed with the -e option and then used with the --format option.
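
For example, for a task with ID 12345 that is still running:

sstat -a -j 12345 --format=JobID,AveCPU,AveRSS,MaxRSS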

To view older tasks, you can specify a start date:

sacct --starttime 2022-04-17 --format=Account,User,JobID,Start,End,AllocCPUS,Elapsed,AllocTRES%30,CPUTime,AveRSS,MaxRSS,MaxRSSTask,MaxRSSNode,NodeList,ExitCode,State%20

The SACCT_FORMAT environment variable allows you to define a format if you do not want to specify it each time you run the program. The following example is the format used by the Alliance:

export SACCT_FORMAT=Account,User,JobID,Start,End,AllocCPUS,Elapsed,AllocTRES%30,CPUTime,AveRSS,MaxRSS,MaxRSSTask,MaxRSSNode,NodeList,ExitCode,State%20

The seff program can be used to view task execution statistics, such as CPU and memory usage percentages.

$  seff 12345
Job ID: 12345
Cluster: cluster
User/Group:
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 6
CPU Utilized: 07:31:57
CPU Efficiency: 22.34% of 1-09:42:54 core-walltime
Job Wall-clock time: 05:37:09
Memory Utilized: 49.33 GB
Memory Efficiency: 49.33% of 100.00 GB

This command lets you check whether the requested resources were well utilized. In this case, the task used only about half of the requested memory, and the CPU was busy for only about 22% of the allocated core-walltime.


The reportseff program allows you to view statistics for multiple tasks at once.

[Screenshot: reportseff output showing CPU and memory efficiency for several tasks]

In the example above, some tasks are very short, which may indicate that the program did not function correctly. The red numbers draw attention to tasks where resources are poorly utilized.

You can find more information about the reportseff program on its software page.


The seff-array program analyzes the tasks of a single array and shows, as histograms, how the tasks are distributed across different levels of resource usage.

$  seff-array 12345
--------------------------------------------------------
Job Information
ID: 12345
Name: test_models.sh
Cluster: cluster
User/Group: -----
Requested CPUs: 6 cores on 1 node(s)
Requested Memory: 100G
Requested Time: 14:00:00
--------------------------------------------------------
Job Status
COMPLETED: 28
--------------------------------------------------------
--------------------------------------------------------
Finished Job Statistics
(excludes pending, running, and cancelled jobs)
Average CPU Efficiency 37.26%
Average Memory Usage 26.38G
Average Run-time 18682.25s
---------------------
CPU Efficiency (%)
---------------------
+0.00e+00 - +1.00e+01  [0]
+1.00e+01 - +2.00e+01  [7]  ████████████████████████████████████████
+2.00e+01 - +3.00e+01  [5]  ████████████████████████████▋
+3.00e+01 - +4.00e+01  [3]  █████████████████▏
+4.00e+01 - +5.00e+01  [3]  █████████████████▏
+5.00e+01 - +6.00e+01  [7]  ████████████████████████████████████████
+6.00e+01 - +7.00e+01  [2]  ███████████▍
+7.00e+01 - +8.00e+01  [1]  █████▊
+8.00e+01 - +9.00e+01  [0]
+9.00e+01 - +1.00e+02  [0]
Memory Efficiency (%)
---------------------
+0.00e+00 - +1.00e+01  [3]  █████████████▍
+1.00e+01 - +2.00e+01  [7]  ███████████████████████████████▏
+2.00e+01 - +3.00e+01  [9]  ████████████████████████████████████████
+3.00e+01 - +4.00e+01  [4]  █████████████████▊
+4.00e+01 - +5.00e+01  [4]  █████████████████▊
+5.00e+01 - +6.00e+01  [0]
+6.00e+01 - +7.00e+01  [0]
+7.00e+01 - +8.00e+01  [1]  ████▌
+8.00e+01 - +9.00e+01  [0]
+9.00e+01 - +1.00e+02  [0]
Time Efficiency (%)
---------------------
+0.00e+00 - +1.00e+01  [ 2]  ████▎
+1.00e+01 - +2.00e+01  [ 0]
+2.00e+01 - +3.00e+01  [ 0]
+3.00e+01 - +4.00e+01  [19]  ████████████████████████████████████████
+4.00e+01 - +5.00e+01  [ 7]  ██████████████▊
+5.00e+01 - +6.00e+01  [ 0]
+6.00e+01 - +7.00e+01  [ 0]
+7.00e+01 - +8.00e+01  [ 0]
+8.00e+01 - +9.00e+01  [ 0]
+9.00e+01 - +1.00e+02  [ 0]
--------------------------------------------------------

Temporary Workspace

A temporary workspace is available in the /scratch directory on each of the optimum machines. This space is also accessible from the slurm frontend machine, allowing you to copy your data before running tasks.
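
For example, to stage a dataset from your home directory before submitting (the per-user subdirectory and the ~/data folder are only illustrations; use whatever layout suits your project):

mkdir -p /scratch/$USER
cp -r ~/data /scratch/$USER/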

Note: There is no backup for files stored in this directory, so do not place any files there that you cannot afford to lose.

We ask that you use this space reasonably.