1 - Connecting to DAIC

How to connect to DAIC?

SSH access

If you have a valid DAIC account (see Access and accounts), you can access DAIC resources using an SSH client. SSH (Secure SHell) is a protocol that allows you to connect to a remote computer via a secure network connection. SSH supports remote command-line login and remote command execution. SCP (Secure CoPy) and SFTP (SSH File Transfer Protocol) are file transfer protocols based on SSH (see Wikipedia's SSH page).

Connecting to DAIC from inside and outside TU Delft network

Access from the TU Delft Network

To connect to DAIC from within the TU Delft network (i.e., via eduroam or a wired connection), open a command-line interface (prompt or terminal, see Wikipedia's CLI page), and run the following command:

$ ssh <YourNetID>@login.daic.tudelft.nl # Or
$ ssh login.daic.tudelft.nl             # If your username matches your NetID 

<YourNetID> is your TU Delft NetID. If the username on the machine you are connecting from matches your NetID, you can omit the <YourNetID>@ prefix, as in the second command.

This will log you in to DAIC’s login1.daic.tudelft.nl node for now. Note that this setup might change in the future as the system undergoes migration, potentially reducing the number of login nodes.

The first time you connect, you will be asked to confirm the server’s identity. The fingerprint shown should match one of the following:

SHA256:MURg8IQL8oG5o2KsUwx1nXXgCJmDwHbttCJ9ljC9bFM
SHA256:O3AjQQjCfcrwJQ4Ix4dyGaUoYiIv/U+isMT5+sfeA5Q
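Comparing fingerprints by eye is error-prone, so the check can be scripted. The following is a minimal sketch, not an official DAIC tool: fingerprint_is_known is a hypothetical helper name, and the list simply repeats the fingerprints given in this guide.

```shell
#!/usr/bin/env sh
# Fingerprints listed in this guide for the DAIC login node, one per line.
known_fingerprints="SHA256:MURg8IQL8oG5o2KsUwx1nXXgCJmDwHbttCJ9ljC9bFM
SHA256:O3AjQQjCfcrwJQ4Ix4dyGaUoYiIv/U+isMT5+sfeA5Q"

# Hypothetical helper: succeeds (exit status 0) only if the fingerprint
# passed as the first argument exactly matches one of the known ones.
fingerprint_is_known() {
  # -F: fixed string, -x: match the whole line, -q: no output
  printf '%s\n' "$known_fingerprints" | grep -Fxq -- "$1"
}

# Example (needs network access to the login node): print the fingerprints
# the server currently offers, then pass each one to fingerprint_is_known.
#   ssh-keyscan login.daic.tudelft.nl 2>/dev/null | ssh-keygen -lf - | awk '{print $2}'
```

If a fingerprint offered by the server is not in the list, do not confirm the connection and contact the cluster administrators instead.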

Once the server’s identity is confirmed, enter your password when prompted (nothing will be printed as you type your password):

The HPC cluster is restricted to authorized users only.
YourNetID@login.daic.tudelft.nl's password:

Next, a welcome message will be shown:

Last login: Mon Jul 24 18:36:23 2023 from tud262823.ws.tudelft.net
 #########################################################################
 #                                                                       #
 # Welcome to login1, login server of the HPC cluster.                   #
 #                                                                       #
 # By using this cluster you agree to the terms and conditions.          #
 #                                                                       #
 # For information about using the HPC cluster, see:                     #
 # https://login.hpc.tudelft.nl/                                         #
 #                                                                       #
 # The bulk, group and project shares are available under /tudelft.net/, #
 # your windows home share is available under /winhome/$USER/.           #
 #                                                                       #
 #########################################################################
 18:40:16 up 51 days,  6:53,  9 users,  load average: 0,82, 0,36, 0,53

You can now verify your environment with basic commands:

YourNetID@login1:~$ hostname  # show the current hostname
login1.hpc.tudelft.nl
YourNetID@login1:~$ echo $HOME  # show the path to your home directory
/home/nfs/YourNetID
YourNetID@login1:~$ pwd  # show current path
/home/nfs/YourNetID
YourNetID@login1:~$ exit  # exit current connection
logout
Connection to login.daic.tudelft.nl closed.

In this example, the user, YourNetID, is logged in via the login node login1.hpc.tudelft.nl, as can be seen from the hostname output. The user has landed in the $HOME directory, as shown by printing its value and confirmed by the pwd command. Finally, the exit command is used to leave the cluster.

Access from outside university network

Direct access to DAIC from outside the university network is blocked by a firewall. To access DAIC, you have two options:

1. Using the Linux Bastion Server

To connect to DAIC via the Linux Bastion Server:

  1. SSH into the bastion server. The bastion server acts as a gateway to the DAIC cluster.

    • If you are an employee or guest, use linux-bastion.tudelft.nl.
    • If you are a student (BSc or MSc), use student-linux.tudelft.nl.

    ssh <YourNetID>@linux-bastion.tudelft.nl # Or
    ssh linux-bastion.tudelft.nl             # If your username matches your NetID
    

    As with the DAIC login nodes, the first time you attempt to log in to the bastion, you will be asked to confirm the server’s identity. Upon confirmation and entering your password, a welcome screen will be shown:

    The authenticity of host 'linux-bastion.tudelft.nl (131.180.123.195)' can't be established.
    ED25519 key fingerprint is SHA256:VJUFsQkIebODETsXwczkInnRrpdYYqAZDbsoKP1we+A.
    This key is not known by any other names                                                                     
    Are you sure you want to continue connecting (yes/no/[fingerprint])? yes 
    Warning: Permanently added 'linux-bastion.tudelft.nl' (ED25519) to the list of known hosts.
    YourNetID@linux-bastion.tudelft.nl's password:                                                                
                    ____  ____ _____                                         
     ___ _ ____   _|___ \|___ \___  |                                                                            
    / __| '__\ \ / / __) | __) | / /                                                                             
    \__ \ |   \ V / / __/ / __/ / /                                                                              
    |___/_|    \_/ |_____|_____/_/                                                                               
    
    YourNetID@srv227:~$
    
  2. Once on the bastion server, SSH into DAIC as shown in SSH access.

    YourNetID@srv227:~$ ssh login.daic.tudelft.nl # Or any other login node
    

2. Using a VPN

You can also use TU Delft’s EduVPN or OpenVPN (see TU Delft’s Access via VPN recommendations) to access DAIC directly. Once connected to the VPN, you can SSH into DAIC as described in Access from the TU Delft Network.

Simplifying SSH with Configuration Files

To simplify SSH connections, you can store configurations in a file on your local machine. The SSH configuration file can be created (or found, if it already exists) in ~/.ssh/config on Linux/Mac systems, or in C:\Users\<YourUserName>\.ssh\config on Windows.

For example, on a Linux system, you can have the following lines in the configuration file:

~/.ssh/config

Host daic
  # Or any other login node
  HostName login.daic.tudelft.nl
  User <YourNetID>
Host bastion
  # For employees and guests. If you are a student, use student-linux.tudelft.nl instead
  HostName linux-bastion.tudelft.nl
  User <YourNetID>
  PreferredAuthentications password

where:

  • The Host keyword starts the SSH configuration block and specifies the name (or pattern of names) to which the configuration entries will apply, daic and bastion in this example.
  • The HostName is the actual hostname to log into. Numeric IP addresses are also permitted (both on the command line and in HostName specifications).
  • The User is the login username. This is especially important when the username differs between your machine and the remote server/cluster.

You can then connect to DAIC from inside the TU Delft network by typing the following command:

$ ssh daic

Or, if you are outside the university network, you can first connect to the bastion server:

$ ssh bastion

Similarly, you can create or modify the configuration file on the bastion server (in ~/.ssh/config) by adding a Host configuration block for DAIC as above, to simplify the connection to DAIC from there.
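Editing the configuration file by hand works well, but the same steps can also be scripted. The sketch below is one possible approach under stated assumptions: add_ssh_host is a hypothetical helper, and the alias, hostname and user are passed in as arguments. It creates the file if needed, restricts its permissions, and appends a Host block only when that alias is not configured yet.

```shell
#!/usr/bin/env sh
# Hypothetical helper: append a Host block to an SSH config file unless a
# block with the same alias already exists.
# Arguments: <config_file> <alias> <hostname> <user>
add_ssh_host() {
  local config_file="$1"
  local host_alias="$2"
  local host_name="$3"
  local user_name="$4"

  mkdir -p "$(dirname "$config_file")"
  touch "$config_file"
  # ssh refuses a config file that is writable by other users.
  chmod 600 "$config_file"

  # Skip the alias if a "Host <alias>" line is already present.
  if grep -Eq "^[[:space:]]*Host[[:space:]]+${host_alias}[[:space:]]*\$" "$config_file"; then
    echo "Host $host_alias is already configured in $config_file"
    return 0
  fi

  printf 'Host %s\n  HostName %s\n  User %s\n' \
    "$host_alias" "$host_name" "$user_name" >> "$config_file"
}

# Example call (replace YourNetID with your own NetID):
#   add_ssh_host "$HOME/.ssh/config" daic login.daic.tudelft.nl YourNetID
```

Running the helper twice with the same alias leaves the file unchanged the second time, so it is safe to keep in a setup script.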

SSH ProxyJump feature

To connect directly from your machine to a DAIC login node (when outside the university network), use the SSH jump host option (-J) to jump through the bastion server as follows:

$ ssh -J YourNetID@linux-bastion.tudelft.nl YourNetID@login.daic.tudelft.nl # use `student-linux.tudelft.nl` instead if you are a student

For convenience, you can also edit your ssh configuration file, ~/.ssh/config, on your local computer as follows:

Host daic
  HostName login.daic.tudelft.nl
  User <YourNetID>
  # For employees and guests. If you are a student, use student-linux.tudelft.nl instead
  ProxyJump <YourNetID>@linux-bastion.tudelft.nl

where:

  • The ProxyJump keyword specifies the jump server, the bastion in this case.

You can then simply run ssh daic to log in.
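Because the bastion differs between students and employees or guests, the choice can be captured in a small script. This is an illustrative sketch only: bastion_for and daic_jump_command are hypothetical helper names, and the second helper prints the jump command rather than running it.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the bastion host that belongs to a role.
bastion_for() {
  case "$1" in
    student)        echo "student-linux.tudelft.nl" ;;
    employee|guest) echo "linux-bastion.tudelft.nl" ;;
    *)              echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print (not run) the ssh jump command for a NetID
# and role. Arguments: <netid> <student|employee|guest>
daic_jump_command() {
  local netid="$1"
  local role="$2"
  local bastion
  bastion="$(bastion_for "$role")" || return 1
  echo "ssh -J ${netid}@${bastion} ${netid}@login.daic.tudelft.nl"
}

# Example:
#   daic_jump_command YourNetID student
```

Printing the command first makes it easy to check the hostnames before connecting.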

Efficient SSH Connections with SSH Multiplexing

SSH multiplexing allows you to reuse an existing connection for multiple SSH sessions, reducing the time spent entering your password for every new connection. After the first connection is established, subsequent connections will be much faster since the existing control connection is reused.

To enable SSH multiplexing on a Linux/Mac system, add the following lines to ~/.ssh/config:

~/.ssh/config

Host *
  ControlMaster auto
  ControlPath /tmp/ssh-%r@%h:%p

where:

  • The ControlPath specifies where to store the “control socket” for the multiplexed connections. %r refers to the remote login name, %h refers to the target host name, and %p refers to the destination port. This ensures that SSH separates control sockets for different connections.
  • The ControlMaster setting activates multiplexing. With the auto setting, SSH will use an existing master connection if available or create a new one when necessary.

This setup will speed up connections after the first one and reduce the need to repeatedly enter your password for each new SSH session.
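To make the ControlPath tokens concrete, the sketch below shows which socket path the pattern /tmp/ssh-%r@%h:%p produces. control_socket_path is a hypothetical helper that only mimics the substitution. Note that %h is the real hostname after HostName is applied, not the alias typed on the command line.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the control socket path that the pattern
# /tmp/ssh-%r@%h:%p yields for a remote user, host and port (default 22).
control_socket_path() {
  local remote_user="$1"
  local host="$2"
  local port="${3:-22}"
  printf '/tmp/ssh-%s@%s:%s\n' "$remote_user" "$host" "$port"
}

# Example:
#   control_socket_path YourNetID login.daic.tudelft.nl

# While a master connection is open, these standard ssh options act on it
# (daic is the alias from the configuration file):
#   ssh -O check daic   # report whether a master connection is running
#   ssh -O exit daic    # ask the master connection to close
```

Closing the master connection with ssh -O exit is useful when a stale socket prevents new sessions from starting.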

2 - Data management & transfer

How and where to store data on DAIC.

Data Management Guidelines

There are different use cases and quota limits for the different TU Delft network drives. For example, Umbrella (project storage) is for everybody and everything, while bulk needs to be cleaned up, migrated and phased out. Always check the TU Delft Overview data storage page for guidelines on using network drives and quota limits.

Data transfer

Your Windows Personal Storage and the Project and Group Storage are available on all TU Delft installed machines, including the DAIC compute servers. If possible, use one of these for files that you want to access on both your personal computer and the compute servers. They are also accessible off-campus through the TU Delft webdata service. See the webdata page for manuals on using the service with your personal computer.

SCP

Both your Linux and Windows Personal Storage and the Project and Group Storage are also available world-wide via an SCP/SFTP client. The simplest transfer method is the scp command, which has the following basic syntax:

$ scp <source_file> <target_destination>       # for files
$ scp -r <source_folder> <target_destination>  # for folders

For example, to transfer a file from your computer to DAIC:

$ scp mylocalfile [<netid>@]login.daic.tudelft.nl:~/destination_path_on_DAIC/

To transfer a folder (recursively) from your computer to DAIC:

$ scp -r mylocalfolder [<netid>@]login.daic.tudelft.nl:~/destination_path_on_DAIC/

To transfer a file from DAIC to your computer:

$ scp [<netid>@]login.daic.tudelft.nl:~/origin_path_on_DAIC/remotefile ./

To transfer a folder from DAIC to your computer:

$ scp -r [<netid>@]login.daic.tudelft.nl:~/origin_path_on_DAIC/remotefolder ./

The above commands will work from either the university network or when using EduVPN. If a “jump” via linux-bastion is needed (see Access from outside university network), replace scp with scp -J <netid>@linux-bastion.tudelft.nl and keep the rest of the command as before. Alternatively, since the storage shares are reachable world-wide via SCP/SFTP, you can copy to or from the bastion server itself:

$ scp <local_file> [<netid>@]linux-bastion.tudelft.nl:<remote_destination>
$ scp -r <local_folder> [<netid>@]linux-bastion.tudelft.nl:<remote_destination>
$ scp [<netid>@]linux-bastion.tudelft.nl:<remote_file> <local_destination> 
$ scp -r [<netid>@]linux-bastion.tudelft.nl:<remote_folder> <local_destination>

$ sftp [<netid>@]linux-bastion.tudelft.nl

Where:

  • Case is important.
  • Items between < > brackets are user-supplied values (so replace with your own NetID, file or folder name).
  • Items between [ ] brackets are optional: when your username on your local computer is the same as your NetID username, you don’t have to specify it.
  • When you specify your NetID username, don’t forget the @ character between the username and the computer name.
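The jump variant of scp can be assembled programmatically, which helps avoid mistakes with the -r and -J options. The following is a sketch under stated assumptions: daic_upload_command is a hypothetical helper, it uses the employee/guest bastion, and it prints the command instead of executing it.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the scp command that uploads a local file or
# folder to DAIC through the bastion server.
# Arguments: <netid> <local_path> <remote_destination>
daic_upload_command() {
  local netid="$1"
  local src="$2"
  local dest="$3"
  local recursive=""

  # Folders must be copied recursively; plain files must not.
  if [ -d "$src" ]; then
    recursive="-r "
  fi

  echo "scp ${recursive}-J ${netid}@linux-bastion.tudelft.nl ${src} ${netid}@login.daic.tudelft.nl:${dest}"
}

# Example (students would use student-linux.tudelft.nl as the jump host):
#   daic_upload_command YourNetID ./results '~/destination_path_on_DAIC/'
```

Review the printed command, then paste it into the terminal to run the transfer.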

rsync

rsync is a robust file copying and synchronization tool commonly used in Unix-like operating systems. It allows you to transfer files and directories efficiently, both locally and remotely. rsync supports options that enable compression, preserve file attributes, and allow for incremental updates.

Basic Usage

  • Copy files locally:

    rsync [options] source destination
    

    This command copies files and directories from the source to the destination.

  • Copy files remotely:

    rsync [options] source user@remote_host:destination
    

    This command transfers files from a local source to a remote destination.

Examples

  • Synchronize a local directory with a remote directory:

    rsync -avz /path/to/local/dir user@remote_host:/path/to/remote/dir
    

    This synchronizes a local directory with a remote directory, using archive mode (-a) to preserve file attributes, verbose mode (-v) for detailed output, and compression (-z) for efficient transfer.

  • Synchronize a remote directory with a local directory:

    rsync -avz user@remote_host:/path/to/remote/dir /path/to/local/dir
    

    This transfers files from a remote directory to a local directory, using the same options as the previous example.

  • Delete files in the destination that are not present in the source:

    rsync -av --delete /path/to/source/dir /path/to/destination/dir
    

    This synchronizes the source and destination directories and deletes files in the destination that are not in the source.

  • Exclude certain files or directories during transfer:

    rsync -av --exclude='*.tmp' /path/to/source/dir /path/to/destination/dir
    

    This synchronizes the source and destination directories, excluding files with the .tmp extension.

Other Options in rsync

In addition to the commonly used options, rsync provides several other options for more advanced control and customization during file transfers:

  • --dry-run: Perform a trial run without making any changes. This option allows you to see what would be done without actually doing it.

  • --checksum: Use checksums instead of file size and modification time to determine if files should be transferred. This is more precise but slower.

  • --partial: Keep partially transferred files and resume them later. This is useful in case of an interrupted transfer.

  • --partial-dir=DIR: Specify a directory to hold partial transfers. This option works well with --partial.

  • --bwlimit=KBPS: Limit the bandwidth used by the transfer to the specified rate in kilobytes per second. Useful for managing network load.

  • --timeout=SECONDS: Set a maximum wait time in seconds for receiving data. If the timeout is exceeded, rsync will exit.

  • --no-implied-dirs: Used together with --relative. The attributes of the implied directories (the leading path components of the source names) are not transferred, so matching path elements that already exist on the destination are left unchanged.

  • --files-from=FILE: Read a list of source files from the specified FILE. This can be useful when you want to transfer specific files.

  • --update: Skip files that are newer on the destination than the source. This is useful for incremental backups.

  • --ignore-existing: Skip files that already exist on the destination. Useful when you want to avoid overwriting existing files.

  • --inplace: Update files in place instead of creating temporary files and renaming them later. This can save disk space and improve speed.

  • --append: Append data to files instead of replacing them if they already exist on the destination.

  • --append-verify: Append data and verify it with checksums to ensure integrity.

  • --backup: Make backups of files that are overwritten or deleted during the transfer. By default, a ~ is appended to the backup filename.

  • --backup-dir=DIR: Specify a directory to store backup files.

  • --suffix=SUFFIX: Specify a suffix to append to backup files instead of the default ~.

  • --progress: Displays the progress of the transfer, including the speed and the number of bytes transferred. This is useful for monitoring long transfers and seeing how much data has been copied so far.

These options, along with others, provide additional flexibility and control over your rsync transfers, allowing you to fine-tune the synchronization process to meet your specific needs.
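Since --delete removes files on the destination, it is worth previewing a transfer with --dry-run first. The sketch below combines the two. mirror_dir is a hypothetical helper, and it assumes both arguments are directories.

```shell
#!/usr/bin/env sh
# Hypothetical helper: preview a mirroring transfer, then perform it.
# Arguments: <source_dir> <destination_dir>
mirror_dir() {
  # Normalise to exactly one trailing slash, so that rsync copies the
  # contents of the source directory into the destination directory.
  local src="${1%/}/"
  local dest="${2%/}/"

  echo "Planned changes:"
  rsync -av --delete --dry-run "$src" "$dest" || return 1

  echo "Applying changes:"
  rsync -a --delete "$src" "$dest"
}

# Example with local paths (either side may also be user@remote_host:path):
#   mirror_dir /path/to/source/dir /path/to/destination/dir
```

For unattended use, the dry-run output can be saved to a log file and inspected before the second rsync call is allowed to run.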

3 - Software

How to set up your tools and/or run certain libraries.

3.1 - Available software

How to find and work with pre-installed software?

General software

Most common general software, like programming languages and libraries, is installed on the DAIC nodes. To check if the program that you need is pre-installed, you can simply try to start it:

$ python
Python 2.7.5 (default, Jun 28 2022, 15:30:04) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()

To find out exactly which binary is used, you can use the which command:

$ which python
/usr/bin/python

Alternatively, you can try to locate the program or library using the whereis command:

$ whereis python
python: /usr/bin/python3.4m-config /usr/bin/python3.6m-x86_64-config /usr/bin/python2.7 /usr/bin/python3.6-config /usr/bin/python3.4m-x86_64-config /usr/bin/python3.6m-config /usr/bin/python3.4 /usr/bin/python3.4m /usr/bin/python2.7-config /usr/bin/python3.6 /usr/bin/python3.4-config /usr/bin/python /usr/bin/python3.6m /usr/lib/python2.7 /usr/lib/python3.4 /usr/lib/python3.6 /usr/lib64/python2.7 /usr/lib64/python3.4 /usr/lib64/python3.6 /etc/python /usr/include/python2.7 /usr/include/python3.4m /usr/include/python3.6m /usr/share/man/man1/python.1.gz

Or, you can check if the package is installed using the rpm -q command as follows:

$ rpm -q python
python-2.7.5-94.el7_9.x86_64
$ rpm -q python4
package python4 is not installed

You can also search with wildcards:

$ rpm -qa 'python*'
python2-wheel-0.29.0-2.el7.noarch
python2-cryptography-1.7.2-2.el7.x86_64
python34-virtualenv-15.1.0-5.el7.noarch
python-networkx-1.8.1-12.el7.noarch
python-gobject-3.22.0-1.el7_4.1.x86_64
python-gofer-2.12.5-3.el7.noarch
python-iniparse-0.4-9.el7.noarch
python-lxml-3.2.1-4.el7.x86_64
python34-3.4.10-8.el7.x86_64
python36-numpy-f2py-1.12.1-3.el7.x86_64
...
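When a job depends on several programs, checking them one by one becomes tedious. A minimal sketch of a batch check follows. check_tools is a hypothetical helper built on the standard command -v builtin, and the tool names in the example are placeholders.

```shell
#!/usr/bin/env sh
# Hypothetical helper: report which of the given commands are on the PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_tools() {
  local tool
  local missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool -> $(command -v "$tool")"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   check_tools python gcc git
```

The return status makes the helper usable at the top of a job script, for example check_tools python gcc || exit 1.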

Useful commands on DAIC

For a list of handy commands on DAIC have a look here.

3.2 - Modules

How to find and work with pre-installed software?

In the context of Unix-like operating systems, the module command is part of the environment modules system, a tool that provides a dynamic approach to managing the user environment. This system allows users to load and unload different software packages or environments on demand. Some frequently used third-party software (e.g., CUDA, cuDNN, MATLAB) is pre-installed on the cluster as environment modules.

Usage

To see or use the available modules, first, enable the software collection:

$ module use /opt/insy/modulefiles

Now, to see all available packages and versions:

$ module avail
---------------------------------------------------------------------------------------------- /opt/insy/modulefiles ----------------------------------------------------------------------------------------------
   albacore/2.2.7-Python-3.4        cuda/11.8                 cudnn/11.5-8.3.0.98        devtoolset/6    devtoolset/10        intel/oneapi  (D)    matlab/R2021b (D)    miniconda/3.9             (D)
   comsol/5.5                       cuda/12.0                 cudnn/12-8.9.1.23   (D)    devtoolset/7    devtoolset/11 (D)    intel/2017u4         miniconda/2.7        nccl/11.5-2.11.4
   comsol/5.6                (D)    cuda/12.1          (D)    cwp-su/43R8                devtoolset/8    diplib/3.2           matlab/R2020a        miniconda/3.7        openmpi/4.0.1
   cuda/11.5                        cudnn/11-8.6.0.163        cwp-su/44R1         (D)    devtoolset/9    :
   ...
  • D is a label for the default module in case multiple versions are available. For example, module load cuda will load cuda/12.1.
  • L means a module is currently loaded.

To check the description of a specific module:

$ module whatis cudnn
cudnn/12-8.9.1.23   : cuDNN 8.9.1.23 for CUDA 12
cudnn/12-8.9.1.23   : NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.

And to use the module or package, load it as follows:

$ module load cuda/11.2 cudnn/11.2-8.1.1.33 # load the module

$ module list                               # check the loaded modules

Currently Loaded Modules:
   1) cuda/11.2   2) cudnn/11.2-8.1.1.33
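In scripts, the module command may not be defined if the shell was started without the usual login setup. The sketch below guards against that case. load_modules is a hypothetical helper, and the module names in the example are placeholders to be replaced with names from module avail.

```shell
#!/usr/bin/env sh
# Hypothetical helper: enable the DAIC software collection and load the
# given modules, or explain why that is not possible.
load_modules() {
  if ! command -v module >/dev/null 2>&1; then
    echo "module command not available in this shell" >&2
    return 1
  fi
  module use /opt/insy/modulefiles || return 1
  module load "$@"
}

# Example:
#   load_modules cuda cudnn && module list
```

Failing early with a clear message is easier to debug than a later "command not found" from a program the module was supposed to provide.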

Compilers and Development Tools

The cluster provides several compilers and development tools through the devtoolset modules:

$ module use /opt/insy/modulefiles
$ module avail devtoolset

---------------------------------------------------------------------------------------------- /opt/insy/modulefiles ----------------------------------------------------------------------------------------------
   devtoolset/6    devtoolset/7    devtoolset/8    devtoolset/9    devtoolset/10    devtoolset/11 (L,D)

  Where:
   L:  Module is loaded
   D:  Default Module

If the avail list is too long consider trying:

"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

$ module whatis devtoolset
devtoolset/11       : Developer Toolset 11 Software Collection
devtoolset/11       : GNU Compiler Collection, GNU Debugger, and other development, debugging, and performance monitoring tools.

$ module load devtoolset/11
$ gcc --version
gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

3.3 - Installing software

How to install unavailable software?

Basic principles

  • On a cluster, it’s important that software is available and identical on all nodes, both login and compute nodes (see Workload scheduler). For self-installed software, it’s easier to install the software in one shared location than to install and maintain the same software separately on every single node. You should therefore install your software on one of the network shares (e.g., your $HOME folder or an umbrella or bulk folder) that are accessible from all nodes (see Storage).

  • As a regular Linux user you don’t have administrator rights. Yet, you can do your normal work, including installing software in a personal folder, without needing administrator rights. Consequently, you don’t need (nor are you allowed) to use the sudo or su commands that are often shown in manuals.

  • DAIC provides only 8GB of storage in the /home directories, and the project spaces (/tudelft.net/...) are Windows-based, which leads to file permission errors when installing packages with pip. However, the /tudelft.net/... locations are mounted on all nodes. Therefore, the recommended way of using your own software and environments is to use containerization and to store your containers under /tudelft.net/staff-umbrella/.... Check out the Apptainer tutorial for guidance.

Managing environments

Conda/Mamba

Conda and Mamba are both package management and environment management tools used primarily in the data science and programming communities. Conda, developed by Anaconda, Inc., allows users to manage packages and create isolated environments for different projects, supporting multiple languages like Python and R. Mamba is a more recent alternative to Conda that offers faster performance and improved dependency solving using the same package repositories as Conda. Both tools help avoid dependency conflicts and simplify the management of software packages and environments.

Using the miniconda module

Miniconda is available as a module:

$ module use /opt/insy/modulefiles  # If not already
$ module load miniconda 
$ which conda 
/opt/insy/miniconda/3.9/bin/conda

Creating a conda environment

To create a new environment you can run conda create:

$ conda create -n env
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.10.1
  latest version: 24.3.0

Please update conda by running

    $ conda update -n base -c defaults conda

 ## Package Plan ##

  environment location: /home/nfs/username/.conda/envs/env

Creating a conda environment from a YAML file

Conda allows you to create environments from a YAML file that specifies the packages and their versions for the desired environment. This feature makes it easier to reproduce environments across different machines and share environment configurations with others.

$ conda env create -f environment.yml [-n <new-name>]

For how to create an environment.yml file, see Exporting environments.
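For orientation, a small environment.yml can also be written by hand. The sketch below creates one from the shell. The environment name, channel and package list are placeholders chosen for illustration, not DAIC recommendations.

```shell
#!/usr/bin/env sh
# Write a minimal example environment.yml in the current directory.
# All names and versions below are placeholders.
cat > environment.yml <<'EOF'
name: my-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy
  - pip
EOF

# The environment can then be created from the file with:
#   conda env create -f environment.yml
```

Pinning at least the Python version in the file helps keep the environment reproducible across machines.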

Environment variables

The following environment variables determine where conda looks for and stores environments and packages:

  • CONDA_PREFIX: This variable is set by conda itself and points to the root directory of the active conda environment. It is meant to be read rather than changed.

  • CONDA_ENVS_DIRS: This variable specifies the directories where conda environments are stored. You can set it to a list of directories (separated by colons on Unix-like systems and semicolons on Windows). Conda will search for and store environments in these directories.

  • CONDA_PKGS_DIRS: This variable specifies the directories where conda stores downloaded packages. Like CONDA_ENVS_DIRS, you can set it to a list of directories. Conda uses these directories as cache locations for package downloads and installations.

Examples

  • Set conda environments directory:
$ export CONDA_ENVS_DIRS="/tudelft.net/staff-umbrella/my-project/"

A caveat is that the /tudelft.net mounts are Windows-based and therefore have compatibility issues with pip. When you create your conda environments there, you will not be able to use pip to install packages. It is therefore recommended to keep the conda environments minimal and in your home directory, and to use containerization for larger environments.
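With environments kept in the home directory, it helps to watch how much of the limited home space conda occupies. A minimal sketch follows. dir_size_mb is a hypothetical helper, and $HOME/.conda is assumed to be where the environments and package cache live, as in the environment location shown earlier.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the size of a directory in whole megabytes
# (0 if the directory does not exist).
dir_size_mb() {
  local dir="$1"
  if [ -d "$dir" ]; then
    # du -sk reports kilobytes; convert to megabytes, dropping fractions.
    du -sk "$dir" | awk '{ printf "%d\n", $1 / 1024 }'
  else
    echo 0
  fi
}

# Example: how much space do conda environments and caches take?
#   echo "conda uses $(dir_size_mb "$HOME/.conda") MB"
#
# Cached package archives can be removed with the standard command:
#   conda clean --all
```

Checking the size before creating a new environment gives an early warning when the home directory is close to its limit.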

List existing environments

You can list existing environments with:

$ conda env list

Activating environments

You can activate an existing environment with conda activate, for example to install more packages:

$ conda activate env  # Activate the newly created environment 

Modifying environments

Sometimes you need to add, remove or change packages and libraries in existing environments. First, activate the environment you want to change with conda activate, and then run conda install package-name or conda remove package-name. You can also use pip to install packages inside a conda environment, but for that pip has to be installed inside the environment. To make sure pip is installed in your environment, run conda install pip first.

(env) $ conda install pandas  # Add a new package to the active environment
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.10.1
  latest version: 24.3.0

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: /home/nfs/username/.conda/envs/env

  added / updated specs:
    - pandas

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    blas-1.0                   |              mkl           6 KB
    bottleneck-1.3.7           |  py312ha883a20_0         140 KB
    bzip2-1.0.8                |       h5eee18b_5         262 KB
    expat-2.6.2                |       h6a678d5_0         177 KB
    intel-openmp-2023.1.0      |   hdb19cb5_46306        17.2 MB
    ld_impl_linux-64-2.38      |       h1181459_1         654 KB
    libffi-3.4.4               |       h6a678d5_0         142 KB
    libuuid-1.41.5             |       h5eee18b_0          27 KB
    mkl-2023.1.0               |   h213fc3f_46344       171.5 MB
    mkl-service-2.4.0          |  py312h5eee18b_1          66 KB
    mkl_fft-1.3.8              |  py312h5eee18b_0         204 KB
    mkl_random-1.2.4           |  py312hdb19cb5_0         284 KB
    ncurses-6.4                |       h6a678d5_0         914 KB
    numexpr-2.8.7              |  py312hf827012_0         149 KB
    numpy-1.26.4               |  py312hc5e2394_0          11 KB
    numpy-base-1.26.4          |  py312h0da6c21_0         7.7 MB
    openssl-3.0.13             |       h7f8727e_0         5.2 MB
    pandas-2.2.1               |  py312h526ad5a_0        15.4 MB
    pip-23.3.1                 |  py312h06a4308_0         2.8 MB
    python-3.12.3              |       h996f2a0_0        34.8 MB
    pytz-2023.3.post1          |  py312h06a4308_0         197 KB
    readline-8.2               |       h5eee18b_0         357 KB
    setuptools-68.2.2          |  py312h06a4308_0         1.2 MB
    six-1.16.0                 |     pyhd3eb1b0_1          18 KB
    sqlite-3.41.2              |       h5eee18b_0         1.2 MB
    tbb-2021.8.0               |       hdb19cb5_0         1.6 MB
    tk-8.6.12                  |       h1ccaba5_0         3.0 MB
    tzdata-2024a               |       h04d1e81_0         116 KB
    wheel-0.41.2               |  py312h06a4308_0         131 KB
    xz-5.4.6                   |       h5eee18b_0         651 KB
    zlib-1.2.13                |       h5eee18b_0         103 KB
    ------------------------------------------------------------
                                           Total:       266.1 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu
  blas               pkgs/main/linux-64::blas-1.0-mkl
  bottleneck         pkgs/main/linux-64::bottleneck-1.3.7-py312ha883a20_0
  bzip2              pkgs/main/linux-64::bzip2-1.0.8-h5eee18b_5
  ca-certificates    pkgs/main/linux-64::ca-certificates-2024.3.11-h06a4308_0
  expat              pkgs/main/linux-64::expat-2.6.2-h6a678d5_0
  intel-openmp       pkgs/main/linux-64::intel-openmp-2023.1.0-hdb19cb5_46306
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.38-h1181459_1
  libffi             pkgs/main/linux-64::libffi-3.4.4-h6a678d5_0
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1
  libgomp            pkgs/main/linux-64::libgomp-11.2.0-h1234567_1
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1
  libuuid            pkgs/main/linux-64::libuuid-1.41.5-h5eee18b_0
  mkl                pkgs/main/linux-64::mkl-2023.1.0-h213fc3f_46344
  mkl-service        pkgs/main/linux-64::mkl-service-2.4.0-py312h5eee18b_1
  mkl_fft            pkgs/main/linux-64::mkl_fft-1.3.8-py312h5eee18b_0
  mkl_random         pkgs/main/linux-64::mkl_random-1.2.4-py312hdb19cb5_0
  ncurses            pkgs/main/linux-64::ncurses-6.4-h6a678d5_0
  numexpr            pkgs/main/linux-64::numexpr-2.8.7-py312hf827012_0
  numpy              pkgs/main/linux-64::numpy-1.26.4-py312hc5e2394_0
  numpy-base         pkgs/main/linux-64::numpy-base-1.26.4-py312h0da6c21_0
  openssl            pkgs/main/linux-64::openssl-3.0.13-h7f8727e_0
  pandas             pkgs/main/linux-64::pandas-2.2.1-py312h526ad5a_0
  pip                pkgs/main/linux-64::pip-23.3.1-py312h06a4308_0
  python             pkgs/main/linux-64::python-3.12.3-h996f2a0_0
  python-dateutil    pkgs/main/noarch::python-dateutil-2.8.2-pyhd3eb1b0_0
  python-tzdata      pkgs/main/noarch::python-tzdata-2023.3-pyhd3eb1b0_0
  pytz               pkgs/main/linux-64::pytz-2023.3.post1-py312h06a4308_0
  readline           pkgs/main/linux-64::readline-8.2-h5eee18b_0
  setuptools         pkgs/main/linux-64::setuptools-68.2.2-py312h06a4308_0
  six                pkgs/main/noarch::six-1.16.0-pyhd3eb1b0_1
  sqlite             pkgs/main/linux-64::sqlite-3.41.2-h5eee18b_0
  tbb                pkgs/main/linux-64::tbb-2021.8.0-hdb19cb5_0
  tk                 pkgs/main/linux-64::tk-8.6.12-h1ccaba5_0
  tzdata             pkgs/main/noarch::tzdata-2024a-h04d1e81_0
  wheel              pkgs/main/linux-64::wheel-0.41.2-py312h06a4308_0
  xz                 pkgs/main/linux-64::xz-5.4.6-h5eee18b_0
  zlib               pkgs/main/linux-64::zlib-1.2.13-h5eee18b_0


Proceed ([y]/n)? y
....

Exporting environments

You can export the versions of all installed packages and libraries inside a conda environment with conda env export. It is good practice to keep track of all versions that you have used for a particular experiment by exporting them into a YAML file, typically called environment.yml:

$ conda env export --no-builds > environment.yml

Install your own mamba/conda

Sometimes the versions provided by module are outdated and users need their own installation of conda or mamba. A minimal version can be installed as demonstrated in the following:

$ alias install-miniforge='
    wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh \
    && bash Miniforge3-Linux-x86_64.sh -b \
    && rm -f Miniforge3-Linux-x86_64.sh \
    && eval "$($HOME/miniforge3/bin/conda shell.bash hook)" \
    && conda init \
    && conda install -n base -c conda-forge mamba'

$ cd ~ && install-miniforge

(base) $  # This shows that the 'base' environment is active.
(base) $ which python
~/miniforge3/bin/python

This will already occupy around 500MB of your home directory, spread over ~20k files:

$ du -h miniforge3 --max-depth=0
486M	miniforge3

$ find miniforge3 -type f | wc -l
20719
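
The two checks above can be wrapped in a small helper, so the footprint of an installation can be compared before and after adding environments. This is a minimal sketch assuming a POSIX shell with find, du and awk available; the function name dir_footprint is hypothetical and not part of DAIC or conda:

```shell
# Hypothetical helper: print the number of files and the size (in MB) of a directory.
dir_footprint() (
    dir=$1
    # Count regular files below the directory
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    # du -sk reports the total size in kilobytes; convert to whole megabytes
    size_kb=$(du -sk "$dir" | awk '{print $1}')
    echo "$files files, $((size_kb / 1024)) MB"
)

# Example (on the cluster): dir_footprint "$HOME/miniforge3"
```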

Now you can install your own versions of libraries and programs, or create entire environments as described above.

Using binaries

Some programs come as precompiled binaries or are written in a scripting language such as Perl, PHP, Python or shell script. Most of these programs don’t actually need to be “installed” since you can simply run these programs directly. In certain scenarios, you may need to make the program executable first using chmod +x:

$ ./my-executable        # attempting to run the binary `my-executable`
-bash: ./my-executable: Permission denied

$ chmod +x my-executable # making `my-executable` executable, since it fails due to permissions

$ ./my-executable        # checking `my-executable` works!
Hello world!
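
The permission check can also be done up front instead of waiting for the "Permission denied" error. The following is a sketch only; ensure_executable is a hypothetical helper name, and it assumes you own the file so that chmod is allowed:

```shell
# Hypothetical helper: make a file executable only if it is not already, then report.
ensure_executable() (
    file=$1
    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 1
    fi
    if [ ! -x "$file" ]; then
        chmod +x "$file"    # add the execute permission
    fi
    echo "ready: $file"
)

# Example: ensure_executable ./my-executable && ./my-executable
```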

Installing from source

When a pre-made binary of your software is not available, you’ll have to install the software yourself from the source. You may need to set up your Installation environment before following this Installation recipe.

Installation environment

When you are installing software for the very first time, you need to set up your environment. If you have already done this before, you can skip this section and go directly to the Installation recipe section.

To set up your environment, first, add the following lines to your ~/.bash_profile or, alternatively, download this (bash_profile.txt) as shown in the subsequent commands:

bash_profile.txt

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
   . ~/.bashrc
fi

# User specific environment and startup settings
export PREFIX="$HOME/.local"
export ACLOCAL_PATH="$PREFIX/share/aclocal${ACLOCAL_PATH:+:$ACLOCAL_PATH}"
export CPATH="$PREFIX/include${CPATH:+:$CPATH}"
export LD_LIBRARY_PATH="$PREFIX/lib64:$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LIBRARY_PATH="$PREFIX/lib64:$PREFIX/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
export MANPATH="$PREFIX/share/man${MANPATH:+:$MANPATH}"
export PATH="$HOME/bin:$PREFIX/bin:$PATH"
export PERL5LIB="$PREFIX/lib64/perl5:$PREFIX/share/perl5${PERL5LIB:+:$PERL5LIB}"
export PKG_CONFIG_PATH="$PREFIX/lib64/pkgconfig:$PREFIX/share/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
export PYTHONPATH="$PREFIX/lib/python2.7/site-packages${PYTHONPATH:+:$PYTHONPATH}"

$ cp ~/.bash_profile ~/.bash_profile.bak # back up your file
$ curl -s https://wiki.tudelft.nl/pub/Research/InsyCluster/InstallingSoftware/bash_profile.txt >> ~/.bash_profile # download and append the lines above

Then, clean up any duplicate settings, and:

$ source ~/.bash_profile
$ mkdir -p "$PREFIX"

The line export PREFIX="$HOME/.local" sets your software installation directory to /home/nfs/<YourNetID>/.local (which is the default and accessible on all nodes). This is in your personal home directory where you have a space quota of 8GB. However, for software for your research project, you should instead use a project share, for example:

export PREFIX="/tudelft.net/staff-umbrella/project/software"

The other variables will let you use your self-installed programs. You are now ready to install your software!
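
The profile lines above rely on the shell expansion ${VAR:+:$VAR}, which appends ":$VAR" only when VAR is set and non-empty. The sketch below isolates that idiom; the function name join_path and the /opt/demo path are purely illustrative assumptions:

```shell
# Demonstrates the ${VAR:+:$VAR} idiom used in the profile above:
# the ":$existing" suffix is only added when the existing value is non-empty,
# which avoids a stray trailing colon (an empty entry) in the path list.
join_path() (
    new=$1
    existing=$2
    echo "$new${existing:+:$existing}"
)

join_path "/opt/demo/include" ""               # prints: /opt/demo/include
join_path "/opt/demo/include" "/usr/include"   # prints: /opt/demo/include:/usr/include
```

Avoiding the empty entry matters because many tools interpret an empty element in such a list as the current directory.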

Installation recipe

Software installation usually just requires you to follow the general installation recipe described below, but you always need to consult the documentation for your software.

  1. Place the source of the software in a folder under /tmp:
$ mkdir /tmp/$USER
$ cd /tmp/$USER

You can sometimes download the software directly from the internet:

$ wget http://host/path/software.tar.gz
$ tar -xzf software.tar.gz

Or, clone the software from a git repository:

$ git clone https://github.com/software

Then:

$ cd software
  2. If the software provides a configure script, run it:
$ ./configure --prefix="$PREFIX"

If configure complains about missing software, you’ll either have to install that software, tell configure where it is (--with-feature=path) or disable the feature (--disable-feature).

If your software provides a CMakeLists.txt file, run cmake (note: the trailing two dots on the last line are needed exactly as shown):

$ mkdir -p build
$ cd build
$ cmake -DCMAKE_INSTALL_PREFIX="$PREFIX" ..

Again, if cmake complains about missing software, you’ll either have to install that software or tell cmake where it is (-DCMAKE_SYSTEM_PREFIX_PATH="/usr/local;/usr;$PREFIX;path").

If neither is provided, consult the documentation for dependencies and configuration (specifically for the installation directory).

There is no point in continuing until all reported problems have been fixed.

  3. Compile the software:
$ make

If compilation is aborted due to an error, Google the error for possible solutions. Again, there is no point in continuing until all reported problems have been fixed.

  4. Install the software. When you used configure or cmake, you can simply run:
$ make install

When you used neither, you need to use:

$ make prefix="$PREFIX" install
  5. Your software should now be ready to use, so check it:
$ cd
$ program
  6. When the program works, clean up /tmp/$USER:
$ rm -r /tmp/$USER
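
Which configuration step of the recipe applies depends on what the source tree ships. As a rough sketch (detect_build_system is a hypothetical helper, and real projects may need more careful inspection of their documentation), the choice could be checked like this:

```shell
# Hypothetical helper: report which of the build systems from the recipe a source tree provides.
detect_build_system() (
    src=$1
    if [ -f "$src/configure" ]; then
        echo "configure"      # use: ./configure --prefix="$PREFIX"
    elif [ -f "$src/CMakeLists.txt" ]; then
        echo "cmake"          # use: cmake -DCMAKE_INSTALL_PREFIX="$PREFIX" ..
    else
        echo "unknown"        # consult the software's documentation
    fi
)

# Example: detect_build_system "/tmp/$USER/software"
```

The configure script is checked first here only because the recipe lists it first; a project that ships both may prefer one over the other.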

3.4 - Containerization

How to use Apptainer on DAIC?

Apptainer

Apptainer is a container platform. It allows you to create and run containers that package up pieces of software in a way that is portable and reproducible. You can build a container using Apptainer on your laptop, and then run it on an HPC cluster. Apptainer was created to run complex applications on HPC clusters in a simple, portable, and reproducible way. The Apptainer template repository (see Template) contains a template for building an Apptainer (formerly Singularity) container using miniforge and mamba (similar to conda). Its examples directory also contains examples for other setups.

Apptainer features

  • Verifiable reproducibility and security, using cryptographic signatures, an immutable container image format, and in-memory decryption.
  • Integration over isolation by default. Easily make use of GPUs, high speed networks, parallel filesystems on a cluster or server by default.
  • Mobility of compute. The single file SIF container format is easy to transport and share.
  • A simple, effective security model. You are the same user inside a container as outside, and cannot gain additional privilege on the host system by default. Read more about Security in Apptainer.

Template

The Apptainer template repository maintained by the Research Engineering and Infrastructure Team is a good starting point to create your own apptainers.

How to use Apptainer on the cluster with SLURM?

Here is an example of how to use the container in a SLURM script.

#!/bin/sh
#SBATCH --job-name="apptainer-job"
#SBATCH --account="my-account"
#SBATCH --partition="general"      # Request partition.
#SBATCH --time=01:00:00            # Request run time (wall-clock). Default is 1 minute
#SBATCH --nodes=1                  # Request 1 node
#SBATCH --ntasks-per-node=1        # Set one task per node
#SBATCH --cpus-per-task=4          # Request number of CPUs (threads) per task.
#SBATCH --gres=gpu:1               # Request 1 GPU
#SBATCH --mem=4GB                  # Request 4 GB of RAM in total
#SBATCH --mail-type=END            # Set mail type to 'END' to receive a mail when the job finishes.
#SBATCH --output=slurm-%x-%j.out   # Set name of output log. %x is the job name, %j is the Slurm jobId
#SBATCH --error=slurm-%x-%j.err    # Set name of error log. %x is the job name, %j is the Slurm jobId

export APPTAINER_ROOT="/path/to/container/folder"
export APPTAINER_NAME="my-container.sif"

# If you use GPUs
module use /opt/insy/modulefiles
module load cuda/12.1

# Run script. Comments cannot follow a line-continuation backslash, so the options are explained here:
#   --nv         Bind NVIDIA libraries from the host
#   --env-file   Source additional environment variables (optional)
#   -B           Mount host file-system inside container (paths are different for each cluster)
# The last two arguments are the path to the container to run and the command to be executed inside it.
srun apptainer exec \
  --nv \
  --env-file ~/.env \
  -B /home/$USER:/home/$USER \
  -B /tudelft.net/:/tudelft.net/ \
  $APPTAINER_ROOT/$APPTAINER_NAME \
  python script.py

Tutorial

See the Apptainer tutorial.

4 - Job submission

How to submit jobs to slurm?

Slurm job’s terminology: job, job step, task and CPUs

A slurm job (submitted via sbatch) can consist of multiple steps in series. Each step (specified via srun) can run multiple tasks (ie programs) in parallel. Each task gets its own set of CPUs. As an example, consider the workflow and corresponding breakdown shown in fig 2.

Slurm job’s terminology


In this example, note:

  • When you explicitly request 1 CPU per task (--cpus-per-task=1), you should also explicitly specify the number of tasks (--ntasks). Otherwise, srun may start the task twice in parallel (because CPUs are allocated in multiples of 2)
  • The default slurm allocation is a single task and single CPU (ie --ntasks=1 --cpus-per-task=1). Thus, it is not necessary to explicitly request these to run a single task on a single CPU.
  • When using multiple tasks, specify --mem-per-cpu.
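
The first note can be made concrete with a little arithmetic. The sketch below assumes that the total number of CPUs of a job (tasks times CPUs per task) is rounded up to the next multiple of 2; the exact rounding rule applied by the scheduler is an assumption here, and allocated_cpus is a hypothetical name:

```shell
# Hypothetical helper: estimate the CPUs allocated for a job,
# assuming the total is rounded up to the next multiple of 2.
allocated_cpus() (
    ntasks=$1
    cpus_per_task=$2
    total=$((ntasks * cpus_per_task))
    echo $(( (total + 1) / 2 * 2 ))
)

allocated_cpus 1 1   # prints: 2  (one CPU more than requested)
allocated_cpus 3 2   # prints: 6  (already a multiple of 2)
```

Under this assumption, a request for 1 task with 1 CPU leaves one spare CPU in the allocation, which is why the number of tasks should be stated explicitly.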

4.1 - Basics of Slurm jobs

How to submit jobs to the cluster?

Job script

Job scripts are text files in which the header is a set of directives that specify compute resources, and the remainder is the code that needs to run. All resources and scheduling are specified in the header as #SBATCH directives (see man sbatch for more information). The code can be a set of steps to run in series, or parallel tasks within these steps (see Slurm job’s terminology).

The code snippet below is a template script that can be customized to run jobs on DAIC. A useful tool for debugging such scripts is ShellCheck.

jobscript.sbatch

#!/bin/sh
#SBATCH --partition=general # Request partition. Default is 'general' 
#SBATCH --qos=short         # Request Quality of Service. Default is 'short' (maximum run time: 4 hours)
#SBATCH --time=0:01:00      # Request run time (wall-clock). Default is 1 minute
#SBATCH --ntasks=1          # Request number of parallel tasks per job. Default is 1
#SBATCH --cpus-per-task=2   # Request number of CPUs (threads) per task. Default is 1 (note: CPUs are always allocated to jobs per 2).
#SBATCH --mem=1024          # Request memory (MB) per node. Default is 1024MB (1GB). For multiple tasks, specify --mem-per-cpu instead
#SBATCH --mail-type=END     # Set mail type to 'END' to receive a mail when the job finishes. 
#SBATCH --output=slurm_%j.out # Set name of output log. %j is the Slurm jobId
#SBATCH --error=slurm_%j.err # Set name of error log. %j is the Slurm jobId

/usr/bin/scontrol show job -d "$SLURM_JOB_ID"  # check sbatch directives are working

#Remaining job commands go below here. For example, to run a Matlab script named "matlab_script.m", uncomment:
#module use /opt/insy/modulefiles # Use DAIC INSY software collection
#module load matlab/R2020b        # Load Matlab 2020b version
#srun matlab < matlab_script.m # Computations should be started with 'srun'.

Job submission

To submit a job script jobscript.sbatch, log in to DAIC, and:

  • To only test:
$ sbatch --test-only jobscript.sbatch
Job 1 to start at 2015-06-30T14:00:00 using 2 processors on nodes insy15 in partition general
  • To actually submit the job and do the computations:
$ sbatch jobscript.sbatch
Submitted batch job 2
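
When scripting around submissions, the job ID usually has to be picked out of the confirmation line shown above. A minimal sketch, where parse_job_id is a hypothetical helper name and the input format is assumed to be exactly "Submitted batch job <ID>":

```shell
# Hypothetical helper: extract the job ID from sbatch's confirmation line.
parse_job_id() {
    # The ID is the last whitespace-separated field of "Submitted batch job <ID>"
    echo "$1" | awk '/^Submitted batch job/ {print $NF}'
}

parse_job_id "Submitted batch job 2"   # prints: 2

# On the cluster this could be combined with a real submission, for example:
#   jobid=$(parse_job_id "$(sbatch jobscript.sbatch)")
#   squeue -j "$jobid"
```

Alternatively, sbatch --parsable prints the job ID without the surrounding text, which avoids the parsing step.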

Using GPU resources

Some DAIC nodes have GPUs of different types, that can be used for various compute purposes (see GPUs).

To request a gpu for a job, use the sbatch directive --gres=gpu[:type][:number], where the optional [:type] and [:number] specify the type and number of the GPUs requested, as in the examples below:

Slurm directives to request gpus for a job
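
The optional parts of the directive can be spelled out as follows. This sketch only assembles the option string from the pattern --gres=gpu[:type][:number]; gres_directive is a hypothetical helper, and v100 is used as the type name because it appears in the interactive example on this page:

```shell
# Hypothetical helper: build a --gres option from an optional GPU type and number.
gres_directive() (
    gpu_type=$1
    number=$2
    spec="gpu"
    if [ -n "$gpu_type" ]; then spec="$spec:$gpu_type"; fi
    if [ -n "$number" ]; then spec="$spec:$number"; fi
    echo "--gres=$spec"
)

gres_directive "" ""     # prints: --gres=gpu         (one GPU of any type)
gres_directive "" 2      # prints: --gres=gpu:2       (two GPUs of any type)
gres_directive v100 1    # prints: --gres=gpu:v100:1  (one GPU of type v100)
```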


An example batch script with GPU resources

#!/bin/sh
#SBATCH --partition=general # Request partition. Default is 'general' 
#SBATCH --qos=short         # Request Quality of Service. Default is 'short' (maximum run time: 4 hours)
#SBATCH --time=0:01:00      # Request run time (wall-clock). Default is 1 minute
#SBATCH --ntasks=1          # Request number of parallel tasks per job. Default is 1
#SBATCH --cpus-per-task=2   # Request number of CPUs (threads) per task. Default is 1 (note: CPUs are always allocated to jobs per 2).
#SBATCH --mem=1024          # Request memory (MB) per node. Default is 1024MB (1GB). For multiple tasks, specify --mem-per-cpu instead
#SBATCH --mail-type=END     # Set mail type to 'END' to receive a mail when the job finishes. 
#SBATCH --output=slurm_%j.out # Set name of output log. %j is the Slurm jobId
#SBATCH --error=slurm_%j.err # Set name of error log. %j is the Slurm jobId

#SBATCH --gres=gpu:1 # Request 1 GPU

# Measure GPU usage of your job (initialization)
previous=$(/usr/bin/nvidia-smi --query-accounted-apps='gpu_utilization,mem_utilization,max_memory_usage,time' --format='csv' | /usr/bin/tail -n '+2') 

/usr/bin/nvidia-smi # Check sbatch settings are working (it should show the GPU that you requested)

# Remaining job commands go below here. For example, to run python code that makes use of GPU resources:

# Uncomment these lines and adapt them to load the software that your job requires
#module use /opt/insy/modulefiles          # Use DAIC INSY software collection
#module load cuda/11.2 cudnn/11.2-8.1.1.33 # Load certain versions of cuda and cudnn 
#srun python my_program.py                 # Computations should be started with 'srun'

# Measure GPU usage of your job (result)
/usr/bin/nvidia-smi --query-accounted-apps='gpu_utilization,mem_utilization,max_memory_usage,time' --format='csv' | /usr/bin/grep -v -F "$previous"

Similarly, to interactively work in a GPU node:

$ hostname # check you are in one of the login nodes
login1.daic.tudelft.nl
$
$ sinteractive --cpus-per-task=1 --mem=500 --time=00:01:00 --gres=gpu:v100:1
Note: interactive sessions are automatically terminated when they reach their time limit (1 hour)!
srun: job 8607665 queued and waiting for resources
srun: job 8607665 has been allocated resources
 15:27:18 up 51 days,  3:04,  0 users,  load average: 62,09, 59,43, 44,04
SomeNetID@insy11:~$
SomeNetID@insy11:~$ hostname # check you are in one of the compute nodes
insy11.daic.tudelft.nl
SomeNetID@insy11:~$
SomeNetID@insy11:~$ nvidia-smi # check characteristics of GPU
Mon Jul 24 15:37:01 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-SXM2-32GB            On | 00000000:88:00.0 Off |                    0 |
| N/A   32C    P0               40W / 300W|      0MiB / 32768MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
SomeNetID@insy11:~$
SomeNetID@insy11:~$ exit # exit the interactive session

Interactive jobs on compute nodes

To work interactively on a node, e.g., to debug running code or to test on a GPU, start an interactive session using sinteractive <compute requirements>. If no parameters are provided, the defaults are applied. <compute requirements> can be specified the same way as sbatch directives within an sbatch script (see Submitting jobs), as in the examples below:

$ hostname # check you are in one of the login nodes
login1.daic.tudelft.nl
$ sinteractive 
 16:07:20 up 12 days, 4:09, 2 users, load average: 7.06, 7.04, 7.12
$ hostname # check you are in a compute node
insy15
$ squeue -u SomeNetID  # Replace SomeNetId with your NetID 
JOBID PARTITION  NAME     USER ST  TIME  NODES NODELIST(REASON)
    2   general  bash SomeNetI  R  1:23      1 insy15  
$ logout # exit the interactive job

To request a node with certain compute requirements:

$ sinteractive --ntasks=1 --cpus-per-task=2 --mem=4096
 16:07:20 up 12 days, 4:09, 2 users, load average: 7.06, 7.04, 7.12

Monitoring slurm jobs

  • To check your job has actually been submitted:
$ squeue -u SomeNetID  # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 2   general  jobscip SomeNetI  R       0:01      1 insy15
  • And to check the log of your job, use an editor or viewer of choice (eg, vi, nano or simply cat) to view the log:
$ cat slurm_2.out
JobId=2 JobName=jobscript.sbatch
   UserId=SomeNetId(123) GroupId=domain users(100513) MCS_label=N/A
   Priority=23909774 Nice=0 Account=ewi-insy QOS=short
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   DerivedExitCode=0:0
   RunTime=00:00:00 TimeLimit=00:01:00 TimeMin=N/A
   SubmitTime=2015-06-30T14:00:00 EligibleTime=2015-06-30T14:00:00
   AccrueTime=2015-06-30T14:00:00
   StartTime=2015-06-30T14:00:01 EndTime=2015-06-30T14:01:01 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2015-06-30T14:01:01  Scheduler=Main
   Partition=general AllocNode:Sid=login1:2220
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=insy15
   BatchHost=insy15
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
   TRES=cpu=2,mem=1G,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   JOB_GRES=(null)
     Nodes=insy15 CPU_IDs=26-27 Mem=1024 GRES=
   MinCPUsNode=2 MinMemoryNode=1G MinTmpDiskNode=50M
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/nfs/SomeNetId/jobscript.sbatch
   WorkDir=/home/nfs/SomeNetId
   StdErr=/home/nfs/SomeNetId/slurm_2.err
   StdIn=/dev/null
   StdOut=/home/nfs/SomeNetId/slurm_2.out
   Power=
   MailUser=SomeNetId@tudelft.nl MailType=END

Sometimes, it may be desirable to inspect slurm jobs beyond their status in the queue. For example, to check which script was submitted, or how the resources were requested and allocated. Below are a few useful commands for this purpose:

  • See job definition
$  scontrol show job 8580148
JobId=8580148 JobName=jobscript.sbatch
   UserId=SomeNetID(123) GroupId=domain users(100513) MCS_label=N/A
   Priority=23721804 Nice=0 Account=ewi-insy QOS=short
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:12 TimeLimit=00:01:00 TimeMin=N/A
   SubmitTime=2023-07-10T06:41:57 EligibleTime=2023-07-10T06:41:57
   AccrueTime=2023-07-10T06:41:57
   StartTime=2023-07-10T06:41:58 EndTime=2023-07-10T06:42:58 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-10T06:41:58 Scheduler=Main
   Partition=general AllocNode:Sid=login1:19162
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=awi18
   BatchHost=awi18
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
   TRES=cpu=2,mem=1G,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=2 MinMemoryNode=1G MinTmpDiskNode=50M
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/nfs/SomeNetID/jobscript.sbatch
   WorkDir=/home/nfs/SomeNetID
   StdErr=/home/nfs/SomeNetID/slurm_8580148.err
   StdIn=/dev/null
   StdOut=/home/nfs/SomeNetID/slurm_8580148.out
   Power=
   MailUser=SomeNetId@tudelft.nl MailType=END
   
  • See statistics of a running job
$ sstat 1
  JobID  AveRSS  AveCPU  NTasks  AveDiskRead AveDiskWrite
------- ------- ------- ------- ------------ ------------
1.0        426K 00:00.0       1        0.52M        0.01M
  • See accounting information of a finished job (also see the --long option)
$ sacct -j 8580148
JobID           JobName  Partition    Account  AllocCPUS      State ExitCode 
------------ ---------- ---------- ---------- ---------- ---------- -------- 
8580148      jobscript+    general   ewi-insy          2  COMPLETED      0:0 
8580148.bat+      batch              ewi-insy          2  COMPLETED      0:0 

  • See overall job efficiency of a finished job
$ seff 8580148
Job ID: 8580148
Cluster: insy
User/Group: SomeNetID/domain users
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 2
CPU Utilized: 00:00:00
CPU Efficiency: 0.00% of 00:01:00 core-walltime
Job Wall-clock time: 00:00:30
Memory Utilized: 340.00 KB
Memory Efficiency: 0.03% of 1.00 GB
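
For scripted monitoring, a single field can be pulled out of the scontrol show job output shown above. The following is a sketch that assumes the KEY=VALUE layout of that output; job_state is a hypothetical helper name:

```shell
# Hypothetical helper: read `scontrol show job` output on stdin and print the JobState value.
job_state() {
    # Put each space-separated KEY=VALUE pair on its own line, then print the first JobState value
    tr ' ' '\n' | awk -F= '$1 == "JobState" && !seen++ {print $2}'
}

printf '   JobState=RUNNING Reason=None Dependency=(null)\n' | job_state   # prints: RUNNING

# On the cluster: scontrol show job 8580148 | job_state
```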

Cancelling jobs

  • And finally, to cancel a given job:
$ scancel <jobID>

4.2 - Priorities, Partitions, Quality of Service & Reservations

How to submit jobs to slurm?

Slurm’s job scheduling and waiting times

When slurm is not configured for FIFO scheduling, jobs are prioritized in the following order:

  1. Jobs that can preempt: Not enabled in DAIC
  2. Jobs with an advanced reservation: See Slurm's Advanced Resource Reservation Guide
  3. Partition PriorityTier: See Priority tiers
  4. Job priority: See Priority calculations and QoS priority
  5. Job ID

Priority tiers

DAIC partitions are tiered:

  • The general partition is in the lowest priority tier,
  • Department partitions (eg, insy, st) are in the middle priority tier, and
  • Partitions for specific groups (eg, influence, mmll) are in the highest priority tier. Those partitions correspond to resources contributed by the respective groups or departments (see Contributing departments).

When resources become available, the scheduler will first look for jobs in the highest priority partition that those resources are in, and start the highest (user) priority jobs that fit within the resources (if any). When resources remain, the scheduler will check the next lower priority tier, and so on. Finally, the scheduler will try to backfill lower (user) priority jobs that fit (if any).

The partition priorities have no impact on resources that are in use, so jobs have to wait until the resources become available.
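
The ordering described above can be mimicked with a toy model: jobs are sorted by partition tier first, then by job priority, then by job ID. This sketch deliberately ignores reservations, resource fit and backfilling; the job IDs, tiers and priorities are made-up values, and order_jobs is a hypothetical name:

```shell
# Toy model: each input line is "<jobid> <partition tier> <job priority>".
# Sort by tier (highest first), then priority (highest first), then job ID (lowest first).
order_jobs() {
    sort -k2,2nr -k3,3nr -k1,1n
}

printf '%s\n' "101 1 500" "102 3 200" "103 3 900" "104 2 900" | order_jobs
# prints:
# 103 3 900
# 102 3 200
# 104 2 900
# 101 1 500
```

Note that job 102 is considered before job 104 despite its lower job priority, because it sits in a higher tier.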

Partition selection

The purpose of this tiering is to let you submit your jobs to multiple partitions (e.g., --partition=mml,insy,general), allowing the scheduler to determine where the job can start the soonest. This ensures your job has the highest possible priority across different partitions in the cluster, without negatively impacting your or others’ resource access.

Keep in mind that:

  • Resources of all partitions (eg, st) are also part of the general partition (see Fig 1). Thus:
    • Submitting to the general partition allows jobs to use all nodes
    • Submitting to group-specific partitions alone results in longer waiting times, since the general partition has many more resources than any of them (the bigger the resource pool, the more chances a job has to be scheduled or backfilled)
    • The optimal strategy is to submit to both general and group-specific partitions when accessible. This is to skip over higher-priority jobs that would otherwise get started first on resources that are also in the specific partition.
  • You should only submit jobs to partitions that your account has access to. Submitting jobs to unauthorized partitions (e.g., using --partition=insy,st when your submitting account does not have access to both of these) will result in the job remaining in a pending state and generate excessive logging, potentially overloading the Slurm controller nodes.
Correct: explicit default account and partition specification

#SBATCH --account=ewi-insy-prb
#SBATCH --partition=insy,general
Correct: Implicit default account omitted since it has access to the specified partitions

#SBATCH --partition=insy,general
Incorrect: Multiple partitions with account mismatch

#SBATCH --account=ewi-insy-prb
#SBATCH --partition=insy,st  
Incorrect: Specifying a wrong account for the partition

#SBATCH --account=ewi-st
#SBATCH --partition=insy 

Priority calculations

Slurm continually calculates job priorities and schedules the execution of jobs based on its configurations. A few configuration parameters affect priority computations:

  • SchedulerType: The type of scheduling used based on available resources, requested resources, and job priorities. On DAIC, slurm is used with the backfill scheduling mechanism. This mechanism allows low priority jobs to backfill idle resources if doing so does not delay the expected start time of any high priority job (based on resource availability).
  • PriorityType: The way priority is computed. On DAIC, a multifactor computation is applied, where job priority at any given time is a weighted sum of the following factors:
    • Fairshare: a measure of the amount of resources that a group (ie account in slurm terminology) has contributed, and the historical usage of the group and the user.
    • QOS: the quality of service associated with the job, which is specified with the slurm --qos directive (see QoS priority).
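
As an illustration of the weighted sum, the sketch below combines the two factors listed above. The weights and factor values are invented for the example and are not DAIC's actual configuration; factors are assumed to lie between 0 and 1, and job_priority is a hypothetical name:

```shell
# Toy calculation: priority = fairshare weight * fairshare factor + QOS weight * QOS factor.
job_priority() {
    # arguments: fairshare weight, fairshare factor, QOS weight, QOS factor
    awk -v wf="$1" -v ff="$2" -v wq="$3" -v fq="$4" \
        'BEGIN { printf "%d\n", wf * ff + wq * fq }'
}

job_priority 1000 0.5 2000 0.25   # prints: 1000  (500 from fairshare + 500 from QOS)
```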

The following commands are useful for checking prioritization of your own jobs:

Command                        Purpose
sprio -j <YourJobID>           Determine the priority of your job
squeue -j <YourJobID> --start  Request your job’s estimated start time
sshare -u <YourNetID>          Determine your current fairshare value

QoS priority

The purpose of the (multiple) QoSs in DAIC is to optimize the throughput of the cluster and to reduce the waiting times for jobs:

  • Long jobs block resources for a long time, thus leading to long waiting times and fragmentation of resources.
  • Short jobs block resources only for short times, and can more easily fill in the gaps in the scheduling of resources (thus start sooner), and are therefore better for throughput and waiting times.

Thus, DAIC has the following policy:

  • To stimulate short jobs, the short QoS has a higher priority, and allows you to use a larger part of all resources, than the medium and long QoS.

  • To prevent long jobs from blocking all resources in the cluster for long times (thus causing long waiting times), only a certain part of all cluster resources is available to all running long QoS jobs (of all users) combined.

  • All running medium QoS jobs together can use a somewhat larger part of all resources in the cluster, and all running short QoS jobs combined are allowed to fill the biggest part of the cluster.

    • These limits are called the QoS group limits.
    • When this limit is reached, no new jobs with this QoS can be started, until some of the running jobs with this QoS finish and release some resources.
    • The scheduler will indicate this with the reason QoS Group CPU/memory/GRES limit.
  • To prevent one user from single-handedly using all available resources in a certain QoS, there are also limits for the total resources that all running jobs of one user in a specific QoS can use.

    • These are called the QoS per-user limits.
    • When this limit is reached, no new jobs of this user with this QoS can be started, until some of the running jobs of this user and with this QoS finish and release some resources.
    • The scheduler will indicate this with the reason QoS User CPU/memory/GRES limit.

These per-group and per-user limits are set by the DAIC user board, and the scheduler strictly enforces these limits. Thus, no user can use more resources than the amount that was set by the user board. Any (perceived) imbalance in the use of resources by a certain QoS or user should not be held against a user or the scheduler, but should be discussed in the user board.

Partitions

In SLURM, a partition is a scheduling construct that groups nodes or resources based on certain characteristics or policies. Partitions are used to organize and manage resources within a cluster, and they allow system administrators to control how jobs are allocated and executed on different nodes.

See partition definitions

On DAIC, the scontrol command only shows you the general partition, although more partitions are available.

$ scontrol show partition
PartitionName=general
   AllowGroups=ALL AllowAccounts=ALL DenyQos=influence
   AllocNodes=login[1-3],oodtest Default=YES QoS=N/A
   DefaultTime=00:01:00 DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=3dgi[1-2],100plus,awi[01-26],cor1,gpu[01-11],grs[1-4],influ[1-6],insy[11-16],tbm5,wis1
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=NONE PreemptMode=OFF
   State=UP TotalCPUs=4064 TotalNodes=59 SelectTypeParameters=NONE
   JobDefaults=(null)
   DefMemPerNode=1024 MaxMemPerNode=UNLIMITED
   TRESBillingWeights=CPU=0.5,Mem=0.083333333G,GRES/gpu=16.0
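
The TRESBillingWeights line determines the billing value that appears in a job's TRES field. As a sketch, assuming the billing value is the sum of each resource multiplied by its weight and then truncated to an integer (Slurm can also be configured to use the maximum instead of the sum), the weights above give:

```shell
# Sketch: billing = CPUs * 0.5 + memory (GB) * 0.083333333 + GPUs * 16.0, truncated.
# The sum-and-truncate rule is an assumption; `billing` is a hypothetical helper name.
billing() {
    # arguments: number of CPUs, memory in GB, number of GPUs
    awk -v cpu="$1" -v mem="$2" -v gpu="$3" \
        'BEGIN { printf "%d\n", cpu * 0.5 + mem * 0.083333333 + gpu * 16.0 }'
}

billing 2 1 0   # prints: 1
billing 4 8 1   # prints: 18
```

The first result agrees with the TRES=cpu=2,mem=1G,node=1,billing=1 line in the job output shown on this page.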

Quality of Service (QoS)

When you submit a job in a slurm-based system, it enters a queue waiting for resources. The partition and Quality of Service (QoS) are the two job parameters slurm uses to assign resources for a job:

  • The partition is a set of compute nodes on which a job can be scheduled. In DAIC, the nodes contributed or funded by a certain group are lumped into a corresponding partition (see Contributing departments). All nodes in DAIC are part of the general partition, but other partitions exist for prioritization purposes on select nodes (see Priority tiers).
  • The Quality of Service is a set of limits that controls what resources a job can use and, therefore, determines the priority level of a job. This includes the run time, CPU, GPU and memory limits on the given partition. Jobs that exceed these limits are automatically terminated (see QoS priority).

For DAIC, Table 1 shows the QoS limits on the general partition.

Table 1: The general partition and its operational and per-QoS per-user limits; specific groups use other partitions and QoS

Partition | QoS         | Priority | Max run time | Jobs per user | CPU limit per QoS | CPU limit per user | GPU limit per QoS | GPU limit per user | Memory limit per QoS | Memory limit per user
general   | interactive | high     | 1 hour       | 1 running     | -                 | 2                  | -                 | 2                  | -                    | 16G
general   | short       | normal   | 4 hours      | 10000         | 3672 (85%)        | 2160 (50%)         | 109 (85%)         | 64 (50%)           | 23159G (85%)         | 13623G (50%)
general   | medium      | medium   | 1 ½ day      | 2000          | 3456 (80%)        | 1512 (35%)         | 103 (80%)         | 45 (35%)           | 21796G (80%)         | 9536G (35%)
general   | long        | low      | 7 days       | 1000          | 3240 (75%)        | 864 (20%)          | 96 (75%)          | 25 (20%)           | 20434G (75%)         | 5449G (20%)
general   | infinite*   | none     | infinite     | 1 running     | 32                | -                  | 2                 | -                  | 250G                 | -

* infinite QoS jobs will be killed when servers go down, eg, during maintenance. It is not recommended to submit jobs with this QoS.
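As an illustration of how the run-time limits in Table 1 translate into a choice of QoS, the sketch below picks the QoS with the shortest maximum run time that still covers a requested number of whole hours. pick_qos is a hypothetical helper; it deliberately ignores the interactive QoS (meant for interactive sessions) and the infinite QoS (not recommended):

```shell
#!/bin/bash
# Suggest the QoS from Table 1 with the shortest maximum run time that
# still covers the requested number of whole hours.
# pick_qos is a hypothetical helper, not a Slurm command.
pick_qos() {
    local hours="$1"
    if   [ "$hours" -le 4 ];   then echo short     # max run time: 4 hours
    elif [ "$hours" -le 36 ];  then echo medium    # max run time: 1 1/2 days
    elif [ "$hours" -le 168 ]; then echo long      # max run time: 7 days
    else
        echo "no QoS in Table 1 covers ${hours}h; consider splitting the job" >&2
        return 1
    fi
}

pick_qos 3     # prints: short
pick_qos 30    # prints: medium
```

Choosing the shortest QoS that fits is usually worthwhile, since Table 1 gives shorter QoS a higher priority and higher per-user limits.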

See Quality of Service definitions

On DAIC you can check the QoS policies with the sacctmgr command:

$ sacctmgr list qos
      Name   Priority  GraceTime    Preempt   PreemptExemptTime PreemptMode                                    Flags UsageThres UsageFactor       GrpTRES   GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit     GrpWall       MaxTRES MaxTRESPerNode   MaxTRESMins     MaxWall     MaxTRESPU MaxJobsPU MaxSubmitPU     MaxTRESPA MaxJobsPA MaxSubmitPA       MinTRES 
---------- ---------- ---------- ---------- ------------------- ----------- ---------------------------------------- ---------- ----------- ------------- ------------- ------------- ------- --------- ----------- ------------- -------------- ------------- ----------- ------------- --------- ----------- ------------- --------- ----------- ------------- 
    normal          0   00:00:00                                    cluster                              DenyOnLimit               1.000000                                                                                                                                                                                                                cpu=1 
     short         50   00:00:00                                    cluster                              DenyOnLimit               1.000000 cpu=3562,gre+                                         65536                                                           04:00:00 cpu=2096,gre+                 10000                                      cpu=1,mem=1M 
      long         25   00:00:00                                    cluster                              DenyOnLimit               1.000000 cpu=3144,gre+                                         65536                                                         7-00:00:00 cpu=838,gres+                  1000                                      cpu=1,mem=1M 
  infinite          0   00:00:00                                    cluster                              DenyOnLimit               1.000000 cpu=32,gres/+                                         65536                                                                                          1         100                                      cpu=1,mem=1M 
interacti+        100   00:00:00                                    cluster                              DenyOnLimit               2.000000                                                       65536                                                           01:00:00 cpu=2,gres/g+         1           1                                      cpu=1,mem=1M 
   student         10   00:00:00                                    cluster                              DenyOnLimit               1.000000 cpu=192,gres+                                         65536                                                           04:00:00 cpu=2,gres/g+         1         100                                      cpu=1,mem=1M 
reservati+        100   00:00:00                                    cluster          DenyOnLimit,RequiresReservation               1.000000                                                       65536                                                                                                  10000                                      cpu=1,mem=1M 
 influence        100   00:00:00                                    cluster                              DenyOnLimit               1.000000                                                       65536                                                                                                  10000                                      cpu=1,mem=1M 
guest-sho+         10   00:00:00                                    cluster                              DenyOnLimit               1.000000 cpu=200,gres+                                         65536                                                           04:00:00 cpu=128,gres+                   100                                      cpu=1,mem=1M 
guest-long          0   00:00:00                                    cluster                              DenyOnLimit               1.000000 cpu=200,gres+                                         65536                                                         7-00:00:00 cpu=128,gres+         1          10                                      cpu=1,mem=1M 
    medium         35   00:00:00                                    cluster                              DenyOnLimit               1.000000 cpu=3352,gre+                                         65536                                                         1-12:00:00 cpu=1466,gre+                  2000                                      cpu=1,mem=1M 

How to use QoS in your sbatch scripts?

In your sbatch.slurm script, you can specify the QoS with the #SBATCH --qos=... option.

Example:

#!/bin/bash
#SBATCH --job-name=hello-world
#SBATCH --partition=general
#SBATCH --account=ewi-insy-reit
#SBATCH --qos=short               # This is how you specify QoS
#SBATCH --time=0:01:00     
#SBATCH --nodes=1        
#SBATCH --ntasks-per-node=1        
#SBATCH --cpus-per-task=2        
#SBATCH --mem=1GB                
#SBATCH --output=slurm-%n-%j.out  
#SBATCH --error=slurm-%n-%j.err

srun echo 'Hi, from Slurm!'
sleep 30  # Wait for 30 seconds before exiting.

QoS for reservations

If you have a reservation, you need to specify both --qos=reservation and --reservation=<reservation-name>. See Using reservations for an example.

Resource reservations

Slurm gives the possibility to reserve one or more compute nodes exclusively for a specific user or group of users. A reservation ensures that the designated node (or nodes) are dedicated solely to the reservation holder’s tasks and are not shared with other users during the reserved period. This feature allows users to plan the execution of future workloads, and accommodates cluster users with special needs beyond the batch system (eg latency measurement scenarios).

Requesting a Reservation

To request a reservation for nodes, please use the Request Reservation form. You can request a reservation for an entire compute node (or a group of nodes) if you have contributed this node (or these nodes) to the cluster and you have special needs that need to be accommodated.

General guidelines for reservation requests:

  • You can be granted a reservation only on nodes from a partition that is contributed by your group (See Partitions to check the name of the partition contributed by your group, and System specifications for a listing of available nodes and their features).
  • Please ask for the least amount of resources you need, to minimize the impact on other users.
  • Plan ahead and request your reservation as soon as possible: Reservations usually ignore running jobs, so any running job on the machine(s) you request will continue to run when the reservation starts. While jobs from other users will not start on the reserved node(s), the resources in use by an already running job at the start time of the reservation will not be available in the reservation until this running job ends. The earlier you request resources, the easier it is to allocate them.

Using reservations

Once your reservation request is approved and a reservation is placed on the system, you can run your jobs in the reservation by specifying --qos=reservation along with the following directives to your slurm commands: --reservation=<name> and --partition=<partition>. For example, to submit the job job.sbatch to a reservation named icra_iv on the cor1 node on the cor partition use:

$ sbatch --qos=reservation --reservation=icra_iv --partition=cor job.sbatch

Alternatively, it is possible to add the following lines to the job.sbatch file, and submit this file as usual:

#SBATCH --qos=reservation
#SBATCH --reservation=icra_iv
#SBATCH --partition=cor

To make use of an existing reservation you have to specify --qos=reservation and --reservation=<reservation-name> in your sbatch script.

Viewing reservations

To view all active and future reservations run the scontrol command as follows:

$ scontrol show reservations
ReservationName=icra_iv StartTime=2023-09-09T00:00:00 EndTime=2023-09-16T00:00:00 Duration=7-00:00:00
   Nodes=cor1 NodeCnt=1 CoreCnt=32 Features=(null) PartitionName=cor Flags=
   TRES=cpu=64
   Users=(null) Groups=(null) Accounts=3me-cor Licenses=(null) State=ACTIVE BurstBuffer=(null) Watts=n/a
   MaxStartDelay=(null)

ReservationName=maintenance weekend 2023-10-14 StartTime=2023-10-13T20:00:00 EndTime=2023-10-16T09:00:00 Duration=2-13:00:00
   Nodes=3dgi[1-2],100plus,awi[01-26],cor1,gpu[01-11],grs[1-4],influ[1-6],insy[11-12,14-16],tbm5,wis1 NodeCnt=58 CoreCnt=2000 Features=(null) PartitionName=(null) Flags=MAINT,IGNORE_JOBS,SPEC_NODES,ALL_NODES
   TRES=cpu=4000
   Users=root Groups=(null) Accounts=(null) Licenses=(null) State=INACTIVE BurstBuffer=(null) Watts=n/a
   MaxStartDelay=(null)
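The reservation listing is verbose, so a one-line-per-reservation summary can be easier to scan. The following sketch condenses the output above into name, start, end and state; reservation_summary is a hypothetical helper that reads the scontrol output from standard input, and it assumes the field order shown in the example:

```shell
#!/bin/bash
# Summarise `scontrol show reservations` output as "name|start|end|state" lines.
# reservation_summary is a hypothetical helper; it assumes the layout shown above.
reservation_summary() {
    awk '
        /^ReservationName=/ {
            line = $0
            sub(/^ReservationName=/, "", line)
            name = line;  sub(/ StartTime=.*/, "", name)     # names may contain spaces
            start = line; sub(/.* StartTime=/, "", start); sub(/ .*/, "", start)
            stop = line;  sub(/.* EndTime=/, "", stop);    sub(/ .*/, "", stop)
        }
        / State=/ {
            state = $0; sub(/.* State=/, "", state); sub(/ .*/, "", state)
            print name "|" start "|" stop "|" state
        }
    '
}

# On a login node (requires Slurm):
#   scontrol show reservations | reservation_summary
```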

4.3 - Advanced Slurm jobs

How to submit jobs to slurm?

Parallelizing jobs with Job Arrays

There can be scenarios, eg in simulations or benchmarking, where a job script needs to run many times with only a different parameter set each time. If done manually, keeping track of the parameter values and corresponding jobIds is cumbersome. Job Arrays are a convenient mechanism for submitting and managing such jobs.

A job array is created by adding the --array=<indexes> directive to an sbatch script (or in the command line), where <indexes> can be either a comma separated list of integers, or a range with optional step size, eg, 1-10:2. The minimum index value is 0, and the maximum is a Slurm configuration parameter (MaxArraySize - 1).
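To check which indexes a given specification produces before submitting, the list and range forms described above can be expanded locally. This is a rough sketch: expand_array_spec is a hypothetical helper, it relies on the seq utility, and it does not handle any other suffixes that Slurm may accept:

```shell
#!/bin/bash
# Expand an --array index specification into the individual indexes.
# Handles comma-separated lists and ranges with an optional step (eg 1-10:2).
# expand_array_spec is a hypothetical helper, not a Slurm command.
expand_array_spec() {
    local spec="$1" part range step old_ifs
    old_ifs="$IFS"; IFS=','
    set -- $spec                        # split the specification on commas
    IFS="$old_ifs"
    for part in "$@"; do
        case "$part" in
            *-*)
                range="${part%%:*}"     # eg 1-10
                step=1
                case "$part" in *:*) step="${part##*:}" ;; esac
                seq "${range%%-*}" "$step" "${range##*-}"
                ;;
            *)
                echo "$part"
                ;;
        esac
    done | paste -s -d ' ' -            # join the indexes on one line
}

expand_array_spec "1-10:2"   # prints: 1 3 5 7 9
expand_array_spec "1,4"      # prints: 1 4
```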

Within a job array, all jobs have the same SLURM_ARRAY_JOB_ID, but each job will have its own environment variable SLURM_ARRAY_TASK_ID that corresponds to the array index value. Additionally, all jobs in the array inherit the same compute resources requirements. In the following examples, arrays of size 2 are created, but with different indexes:

$ sbatch --array=1,4 jobscript.sbatch # Indexes specified as a list, and have values 1 and 4
Submitted batch job 8580151
$
$ squeue -u SomeNetID  # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
         8580151_1   general jobscrip SomeNetID  R       0:01      1 grs4
         8580151_4   general jobscrip SomeNetID  R       0:01      1 awi18
$ sbatch --array=1-2 jobscript.sbatch  # Range specified with default step size = 1. Indexes have values 1 and 2
Submitted batch job 8580149
$
$ squeue -u SomeNetID  # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
         8580149_1   general jobscrip SomeNetID  R       0:21      1 grs4
         8580149_2   general jobscrip SomeNetID  R       0:21      1 awi18

JobId and environment variables

As shown in the previous section, Parallelizing jobs with job arrays, jobs within an array are assigned special slurm variables. These variables can be exploited for various computational objectives. Among these, SLURM_ARRAY_TASK_ID is the index of an individual task within the array, and SLURM_ARRAY_JOB_ID is the slurm jobId of the entire array job.

In the simplest case, you can use the ${SLURM_ARRAY_TASK_ID} directly in a script to assign parameter values. For example, to run a workflow across a set of images image_1.png to image_5.png, you can simply create an array using the sbatch directive --array=1-5, and then, within your sbatch script, use image_${SLURM_ARRAY_TASK_ID}.png to indicate the corresponding image.
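A minimal sketch of such a script is given here. The process_image function is a hypothetical stand-in for your own workflow, and the fallback value of 1 is only an assumption that lets the script be tried outside Slurm, where SLURM_ARRAY_TASK_ID is not set:

```shell
#!/bin/bash
#SBATCH --array=1-5

# Outside Slurm, SLURM_ARRAY_TASK_ID is unset; fall back to 1 for local testing.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
image="image_${task_id}.png"

# process_image is a hypothetical stand-in for the real workflow.
process_image() {
    echo "Processing $1"
}

process_image "$image"
```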

In more complex scenarios, eg, when the parameters of interest are not mappable to indexes (of a job array), you can use a config file to map the parameters to the job array indexes. For example, let’s assume the following parameters:

$ cat jobarray.config
i       Flower  Color   Origin  
1       Rose    Red     Worldwide
2       Jasmine  White   Asia
3       Tulip   Various Persia&Turkey
4       Orchid  Various Worldwide
5       Lily    Various Worldwide

Now, you can use these parameters inside a job script as follows:

$ cat jobarray.sbatch
#!/bin/bash
#SBATCH --job-name=JobArrayExample
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --array=1-5             # Array with 5 tasks
#SBATCH --output=slurm-%A_%a.out # Set name of output log. %A is SLURM_ARRAY_JOB_ID and %a is SLURM_ARRAY_TASK_ID
#SBATCH --error=slurm-%A_%a.err  # Set name of error log. %A is SLURM_ARRAY_JOB_ID and %a is SLURM_ARRAY_TASK_ID

config=jobarray.config          # Path to config file

# Obtain parameters from config file:
flower=$(awk -v ArrayTaskID="$SLURM_ARRAY_TASK_ID" '$1==ArrayTaskID {print $2}' "$config")
color=$(awk -v ArrayTaskID="$SLURM_ARRAY_TASK_ID" '$1==ArrayTaskID {print $3}' "$config")
origin=$(awk -v ArrayTaskID="$SLURM_ARRAY_TASK_ID" '$1==ArrayTaskID {print $4}' "$config")

# Use the parameters, eg, print the index and parameter values to a file:
echo "Array task: ${SLURM_ARRAY_TASK_ID},  Flower: ${flower}, color: ${color}, origin: ${origin}" >> output.txt

$
$ sbatch jobarray.sbatch
Submitted batch job 8580317
$ squeue -u SomeNetID  # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     8580317_[1-5]   general JobArray SomeNetID PD       0:00      1 (Priority)     

In this example, slurm created 5 jobs in a job array, each using the same settings: the name JobArrayExample, 1 task with 1 CPU, and an output and error file named with both the array job ID and the task ID. Because the script does not set them, the remaining settings take their default values (the general partition, short QoS, 00:01:00 time, and 1G memory). Each task looks up its parameter values in the config file, using its index via the awk command.
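The three separate awk calls can also be folded into one lookup that fails loudly when no row matches the task index, which helps catch a mismatch between the --array range and the config file. The following is a sketch only; config_row is a hypothetical helper and it assumes the four-column layout of jobarray.config:

```shell
#!/bin/bash
# Print columns 2-4 of the row whose first column equals the given index.
# Returns a non-zero status when no row matches.
# config_row is a hypothetical helper; it assumes the jobarray.config layout.
config_row() {
    local config="$1" index="$2"
    awk -v id="$index" '
        $1 == id { print $2, $3, $4; found = 1; exit }
        END      { exit !found }
    ' "$config"
}

# Inside the job script, one lookup can then fill all three variables:
#   if row=$(config_row jobarray.config "$SLURM_ARRAY_TASK_ID"); then
#       read -r flower color origin <<< "$row"
#   else
#       echo "No config row for task $SLURM_ARRAY_TASK_ID" >&2
#       exit 1
#   fi
```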

Tasks within a job array run in parallel, so there is no guarantee about their order of execution. This is evident from the output file of this example:

$ cat output.txt
Array task: 2,  Flower: Jasmine, color: White, origin: Asia
Array task: 3,  Flower: Tulip, color: Various, origin: Persia&Turkey
Array task: 1,  Flower: Rose, color: Red, origin: Worldwide
Array task: 5,  Flower: Lily, color: Various, origin: Worldwide
Array task: 4,  Flower: Orchid, color: Various, origin: Worldwide

Other slurm variables that are set inside a job array are shown in the following table, with values based on the preceding example:

Slurm Environment Variable | Description                            | Value in example
SLURM_ARRAY_JOB_ID         | The first job ID of the array.         | 8580317
SLURM_ARRAY_TASK_ID        | The job array index value.             | A value in range 1-5
SLURM_ARRAY_TASK_COUNT     | The number of tasks in the job array.  | 5
SLURM_ARRAY_TASK_MAX       | The highest job array index value.     | 5
SLURM_ARRAY_TASK_MIN       | The lowest job array index value.      | 1
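These variables can also be used to divide a larger workload among the tasks. As a sketch, assuming the work consists of items numbered from 1 and the array indexes are contiguous (step size 1), each task could compute its own block of items as follows. chunk_bounds is a hypothetical helper:

```shell
#!/bin/bash
# Split `total` work items (numbered 1..total) into contiguous chunks,
# one per array task, and print "first last" for the given task.
# chunk_bounds is a hypothetical helper; it assumes contiguous array indexes.
# If first is greater than last, the task has nothing to do.
chunk_bounds() {
    local total="$1" count="$2" min="$3" task="$4"
    local position=$(( task - min ))                 # 0-based position of this task
    local size=$(( (total + count - 1) / count ))    # chunk size, rounded up
    local first=$(( position * size + 1 ))
    local last=$(( first + size - 1 ))
    if [ "$last" -gt "$total" ]; then
        last="$total"
    fi
    echo "$first $last"
}

# Inside an array job, for 100 items:
#   chunk_bounds 100 "$SLURM_ARRAY_TASK_COUNT" "$SLURM_ARRAY_TASK_MIN" "$SLURM_ARRAY_TASK_ID"
chunk_bounds 100 5 1 2   # prints: 21 40
```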

Slurm commands and job arrays

The squeue command reports all submitted jobs. By default, squeue reports all of the tasks associated with a job array in one line and uses a regular expression to indicate the SLURM_ARRAY_TASK_ID values. To explicitly print one job array element per line, use the --array or -r flag. The following examples highlight the difference, using the same jobarray.sbatch file from the JobId and environment variables section:

$ sbatch jobarray.sbatch 
Submitted batch job 8593299
$
$ squeue -u SomeNetID  # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     8593299_[1-5]   general JobArray SomeNetID PD       0:00      1 (Priority)
$     
$ squeue -r -u SomeNetID  # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
         8593299_1   general JobArray SomeNetID PD       0:00      1 (Priority)
         8593299_2   general JobArray SomeNetID PD       0:00      1 (Priority)
         8593299_3   general JobArray SomeNetID PD       0:00      1 (Priority)
         8593299_4   general JobArray SomeNetID PD       0:00      1 (Priority)
         8593299_5   general JobArray SomeNetID PD       0:00      1 (Priority)

scancel, on the other hand, can be used to cancel an entire job array by specifying its SLURM_ARRAY_JOB_ID. Alternatively, to cancel a specific task (or tasks), both its SLURM_ARRAY_JOB_ID and SLURM_ARRAY_TASK_ID must be specified, possibly with a regular expression, as shown in the following examples:

$ sbatch jobarray.sbatch
$ squeue -u SomeNetID  # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     8593321_[1-5]   general JobArray SomeNetID PD       0:00      1 (Priority)
$     
$ scancel 8593321_4     # Cancel task with index 4 in the array
$ squeue -u SomeNetID   # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
   8593321_[1-3,5]   general JobArray SomeNetID PD       0:00      1 (Priority)
$
$ scancel 8593321_[1-3] # Cancel tasks in index range 1-3 in the array
$ squeue -u SomeNetID   # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
         8593321_5   general JobArray SomeNetID PD       0:00      1 (Priority)
$
$ scancel 8593321       # Cancel all tasks in the array
$ squeue -u SomeNetID  # Replace SomeNetId with your NetID 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
$ 

Deploying dependent jobs (job chains)

In certain scenarios, it might be desirable to condition the execution of a certain job on the status of another job. In such cases, the sbatch directive --dependency=<condition>:<jobID> can be used, where <condition> specifies the type of dependency (see Table 2), and <jobID> is the slurm jobID upon which the dependency is based. To specify more than one dependency, the , separator indicates that all dependencies must be satisfied, and the ? separator denotes that any one of the dependencies may be satisfied.

For example, assume the slurm job scripts, job_1.sbatch, … job_3.sbatch need to run sequentially one after the other. To start this chain, submit the first job and obtain its jobID:

$ sbatch job_1.sbatch
Submitted batch job 8580135

Next, submit the second job to run only if the first job is successful:

$ sbatch --dependency=afterok:8580135 job_2.sbatch
Submitted batch job 8580136

And, now, to run the third job only after the first two jobs have both run successfully (each dependency is written in full as <condition>:<jobID>):

$ sbatch --dependency=afterok:8580135,afterok:8580136 job_3.sbatch
Submitted batch job 8580140

Alternatively, if the third job is dependent on either job running successfully:

$ sbatch --dependency=afterok:8580135?afterok:8580136 job_3.sbatch
Submitted batch job 8580141

Table 2: Possible sbatch dependency conditions

Argument   | Description
after      | This job can begin execution after the specified jobs have begun execution
afterany   | This job can begin execution after the specified jobs have terminated.
aftercorr  | A task of this job array can begin execution after the corresponding task ID in the specified job has completed successfully
afternotok | This job can begin execution after the specified jobs have terminated in some failed state
afterok    | This job can begin execution after the specified jobs have successfully executed
singleton  | This job can begin execution after any previously launched jobs sharing the same job name and user have terminated
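Copying job IDs by hand becomes error-prone for longer chains. The sketch below builds the dependency value from a condition and a list of job IDs, joining IDs that share one condition with a colon, as the sbatch manual allows. dependency_spec is a hypothetical helper, and the commented-out chain assumes the job_1.sbatch to job_3.sbatch scripts from the example above:

```shell
#!/bin/bash
# Build a --dependency value from a condition and one or more job IDs.
# Job IDs that share one condition are joined with ":" (afterok:ID1:ID2).
# dependency_spec is a hypothetical helper, not a Slurm command.
dependency_spec() {
    local condition="$1"
    shift
    local IFS=':'
    echo "${condition}:$*"
}

# Chain three jobs; `sbatch --parsable` prints the job ID instead of the
# "Submitted batch job ..." message (followed by ";cluster" on multi-cluster
# setups). Commented out because it needs Slurm and the job scripts:
#   id1=$(sbatch --parsable job_1.sbatch)
#   id2=$(sbatch --parsable --dependency="$(dependency_spec afterok "$id1")" job_2.sbatch)
#   sbatch --dependency="$(dependency_spec afterok "$id1" "$id2")" job_3.sbatch

dependency_spec afterok 8580135 8580136   # prints: afterok:8580135:8580136
```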

Troubleshooting Common Issues

Please see the Frequently asked questions on Scheduler problems and Job resources.

4.4 - Kerberos

How to submit jobs to slurm?

Kerberos Authentication

Kerberos is an authentication protocol which uses tickets to authenticate users (and computers). You automatically get a ticket when you log in with your password on a TU Delft installed computer. You can use this ticket to authenticate yourself without password when connecting to other computers or accessing your files. To protect you from misuse, the ticket expires after 10 hours or less (even when you’re still logged in).

File access

Your Linux and Windows Home directories and the Group and Project shares are located on network fileservers, which allows you to access your files from all TU Delft installed computers. Kerberos authentication is used to enable access to, or protect, your files. Without a valid Kerberos ticket (e.g. when the ticket has expired) you will not be able to access your files but instead you will receive a Permission denied error.

Lifetime of Kerberos Tickets

Kerberos tickets have a limited valid lifetime (of up to 10 hours) to reduce the risk of abuse, even when you stay logged in. If your tickets expire, you will receive a Permission Denied error when you try to access your files and a password prompt when you try to connect to another computer. When you want your program to be able to access your files for longer than the valid ticket lifetime, you’ll have to renew your ticket (repeatedly) until your program is done. Kerberos tickets can be renewed up to a maximum renewable life period of 7 days (again to reduce the risk of abuse).

The command klist -5 lists your cached Kerberos tickets together with their expiration time and maximum renewal time:

$ klist -5
Ticket cache: FILE:/tmp/krb5cc_uid_random
Default principal: YourNetID@TUDELFT.NET

Valid starting     Expires            Service principal
01/01/01 00:00:00  01/01/01 10:00:00  krbtgt/TUDELFT.NET@TUDELFT.NET
        renew until 01/08/01 00:00:00

Where:

  • Ticket cache: The Kerberos tickets that have been issued to you are stored in a ticket cache file. You can have multiple ticket cache files on the same computer (from different connections, for example) with different tickets and ticket expiration times. Some ticket cache files are automatically removed when you logout.

  • Default principal: Your identity.

  • Service principal: The identity of services that you have gotten tickets for. You always need a Kerberos ticket-granting ticket (krbtgt) in order to obtain other tickets for specific services like accessing files (nfs) or connecting to computers (host).

  • Valid starting, Expires: Your ticket is only valid between these times (this period is called the valid lifetime). After this time you will not be able to use the service nor automatically renew the ticket (without password).

  • Renew until: Your ticket can only be renewed without password up to this time. After this time you will have to obtain a new ticket using your password.

Renewing Kerberos tickets

If you have a valid Kerberos krbtgt ticket, you can renew it at any time (until it expires) by running the command kinit -R:

$ kinit -R
$ klist -5
Ticket cache: FILE:/tmp/krb5cc_uid_random
Default principal: YourNetID@TUDELFT.NET

Valid starting     Expires            Service principal
01/01/01 01:00:00  01/01/01 11:00:00  krbtgt/TUDELFT.NET@TUDELFT.NET
        renew until 01/08/01 00:00:00

When the krbtgt ticket has expired or reached its renew until time, you will have to obtain a new ticket by running kinit -r 7d (note the difference in case for the r) and authenticating with your password:

$ kinit -r 7d
Password for YourNetID@TUDELFT.NET:
$ klist -5
Ticket cache: FILE:/tmp/krb5cc_uid_random
Default principal: YourNetID@TUDELFT.NET

Valid starting     Expires            Service principal
01/01/01 11:00:00  01/01/01 21:00:00  krbtgt/TUDELFT.NET@TUDELFT.NET
        renew until 01/08/01 11:00:00

The new ticket will have a valid lifetime of 10 hours and a renewable life of 7 days.

On the TU Delft Linux desktops your Kerberos ticket is refreshed (i.e. replaced by a new ticket) automatically every time you enter your password for unlocking the screen saver.

On remote computers you have to manually renew your tickets before they expire.
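Manual renewal can be scripted with the kinit -R command shown above. The following is a rough sketch, not a supported DAIC tool: renew_ticket is a hypothetical helper, and the 8-hour interval is an assumption chosen to stay within the 10-hour ticket lifetime:

```shell
#!/bin/bash
# Renew the Kerberos ticket a fixed number of times, pausing between renewals.
# renew_ticket is a hypothetical helper; it stops as soon as `kinit -R` fails,
# for example because the "renew until" time has passed.
renew_ticket() {
    local times="$1" interval="$2" i=1
    while [ "$i" -le "$times" ]; do
        if ! kinit -R; then
            echo "Renewal $i failed; run 'kinit -r 7d' to obtain a new ticket" >&2
            return 1
        fi
        sleep "$interval"
        i=$(( i + 1 ))
    done
}

# Example: renew every 8 hours (28800 s), 20 times, i.e. for just under 7 days:
#   renew_ticket 20 28800 &
```

The loop cannot extend a ticket beyond its maximum renewable life; after that, a new ticket must still be obtained with a password.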

Slurm & Kerberos

  • Slurm caches your Kerberos ticket, and uses it to execute your job
  • Regularly renew the ticket in Slurm’s cache while your jobs are queued or running:
$ auks -a
Auks API request succeed
  • To automatically renew your ticket in Slurm’s cache until you change your NetID password, run the following on the login1 node:
$ install_keytab
Password for somebody@TUDELFT.NET:
Installed keytab.

You need to rerun this command whenever you change your NetID password (at least every 6 months). Otherwise, the automatic renewal will not work and you will receive a warning e-mail.

Renewal using screen

On the compute nodes, the screen program has been modified to allow jobs to run unattended for up to 7 days. It creates a private ticket cache (to prevent the cache from being destroyed at logout) and automatically renews your ticket up to the maximum renewable life. For example, start MATLAB in Screen with screen matlab (the order is important!).

$ screen matlab
Warning: No display specified.  You will not be able to display graphics on the screen.

                           < M A T L A B (R) >
                 Copyright 1984-2010 The MathWorks, Inc.
              Version 7.11.0.584 (R2010b) 64-bit (glnxa64)
                             August 16, 2010


  To get started, type one of these: helpwin, helpdesk, or demo.
  For product information, visit www.mathworks.com.

>>

For longer jobs you have to manually obtain a new ticket at least every 7 days by running kinit -r 7d from within screen (so you use the specific ticket cache file that screen is using):

  1. connect to screen (screen -r),
  2. create a new window (Ctrl-a c),
  3. run kinit -r 7d,
  4. exit the window (exit) and
  5. detach from screen (Ctrl-a d).
$ kinit -r 7d
Password for YourNetID@TUDELFT.NET:
$ klist -5
Ticket cache: FILE:/tmp/krb5cc_uid_private
Default principal: YourNetID@TUDELFT.NET

Valid starting     Expires            Service principal
01/08/01 09:00:00  01/08/01 19:00:00  krbtgt/TUDELFT.NET@TUDELFT.NET
        renew until 01/15/01 09:00:00
$ exit

Extra functionality can be provided by the k5start and krenew programs. On most computers these are not available by default but can be installed.

5 - Best practices

What is acceptable usage of DAIC?

The available processing power and memory in DAIC is large, but still limited. You should use the available resources efficiently and fairly. This page lays out a few general principles and guidelines for considerate use of DAIC.

Using shared resources

The computing nodes within DAIC are primarily meant to run large, long (non-interactive) jobs. You share these resources with other users across departments. Thus, you need to be cautious of your usage so you do not hinder other users.

To help protect the active jobs and resources, when a login node becomes overloaded, new logins to this node are automatically disabled. This means that you will sometimes have to wait for other jobs to finish and at other times ICT may have to kill a job to create space for other users.

One rule: Respect your fellow users.

Implication: we reserve the right to terminate any job or process that we feel is clearly interfering with the ability of others to complete work, regardless of technical measures or its resource usage.

Best practices

  • Connect only directly from the bastion server to the login nodes (See Connecting to DAIC)
  • Always choose the login node with the lowest use (most importantly system load and memory usage), by checking the Current resource usage page or the servers command for information.
    • Each login node displays a message at login. Make sure you understand it before proceeding. This message includes the current load of the node, so look at it at every login
  • Only use the storage best suited to your files (See Storage).
  • Do interactive code development, debugging and testing on your local machine, as much as possible. On the cluster, try to organize your code as scripts, instead of working interactively on the command line.

  • If you need to test and debug on the cluster, for example, on a GPU node, request an interactive session and do not work on the login node itself (See Interactive jobs on compute nodes).

  • Save results frequently: your job can crash, the compute node can become overloaded, or the network shares can become unavailable.

  • Write your code in a modular way, so that you can continue the job from the point where it last crashed.

  • Actively monitor the status of your jobs:
    • Make sure your job runs normally and is not hindering other jobs. Check the following at the start of a job and thereafter at least twice a day:
      • If your job is not working correctly (or halted) because of a programming error, terminate it immediately; debug and fix the problem instead of just trying again (the result will almost certainly be exactly the same).
      • If your screen’s Kerberos ticket has expired, renew it so your job can successfully save its results.
      • Use the top program to monitor the CPU (%CPU) and memory (%MEM) usage of your code. If either is too high, kill your code so it doesn’t cause problems for other users.
      • Don’t leave top running unless you are continuously watching it; press q to quit.
      • Watch the current resource usage (see Current resource usage page or use the servers command), and if the node is running close to its limits (higher than 90% load or memory, swap or disk usage), consider moving your job to a less busy node.
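The advice above on saving results frequently and writing code that can continue after a crash can be sketched in a few lines of shell. The names run_steps and do_step, and the layout of the marker files, are hypothetical; the point is only that each step records its completion after its result is saved, so a restarted job skips finished work:

```shell
#!/bin/bash
# Run numbered steps in order, recording each finished step so that a
# restarted job skips work that is already done.
# run_steps, do_step and the marker-file layout are hypothetical.
do_step() {
    echo "result of step $1"          # stand-in for the real computation
}

run_steps() {
    local workdir="$1" steps="$2" step=1
    mkdir -p "$workdir"
    while [ "$step" -le "$steps" ]; do
        if [ ! -e "$workdir/step_${step}.done" ]; then
            do_step "$step" > "$workdir/step_${step}.out"
            touch "$workdir/step_${step}.done"     # mark only after the result is saved
        fi
        step=$(( step + 1 ))
    done
}

# Example: run_steps results 10
# Running the same command again after a crash resumes at the first unfinished step.
```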

Computing on login nodes

  • You can use login nodes for basic tasks like compiling software, preparing submission scripts for the batch queue, submitting and monitoring jobs in the batch queue, analyzing results, and moving data or managing files.

  • Small-scale interactive work may be acceptable on login nodes if your resource requirements are minimal.

  • Please do not run production research computations on the login nodes. If necessary, request an interactive session in these cases (See Interactive jobs on compute nodes).

6 - Handy commands on DAIC

Brief description of useful command-line tools.

BASH commands

BASH (Bourne Again SHell) is an open-source Unix shell and command language. It is the default shell on many Linux distributions, was the default on macOS until zsh replaced it in macOS Catalina, and is available on Windows via the Windows Subsystem for Linux, Git BASH, and other emulators. BASH is widely used for scripting and automating tasks. Below are some fundamental BASH commands with examples and brief explanations. Use them carefully, especially those that can modify or delete files and directories.

man

The man command is a tool for displaying the manual pages (documentation) of various commands and utilities available on Unix-like operating systems. It is an essential resource for users seeking detailed information about a specific command, program, or configuration file.

Basic Usage

Display the manual page for a command:

man <command>

This displays the manual page for the specified command.

Examples

Show the manual page for the ls command:

man ls

Show the manual page for the man command:

man man

echo

Displays the line of text passed to it as an argument. It is the basic way to print output in shell scripts.

Example: Display “Hello, World!”.

echo "Hello, World!"

cd

Changes the current directory to another directory. It’s a basic command to navigate through the filesystem.

Example: Change to the home directory.

cd ~

ls

Lists the contents of a directory. It’s a key command to view files and directories.

Example: List all files and directories in the current directory, including hidden files.

ls -a

tree

The tree command is a utility that displays the directory structure of a path in a tree-like format. It provides a visual representation of the hierarchy of files and directories, making it easier to understand the organization of a file system.

Basic Usage

  • Display the directory tree structure:

    tree [path]
    

    This command displays the directory structure starting from the specified path or the current directory if no path is specified.

Options

  • -a: Display all files and directories, including hidden ones (those starting with a dot).
  • -d: Display only directories, omitting files.
  • -L level: Limit the depth of the tree to the specified level.
  • --noreport: Suppress the file and directory count summary at the end of the output.
  • -H baseHREF: Create an HTML output starting with the specified base URL.
  • -o filename: Output the tree structure to a file with the specified name.
  • --charset encoding: Use the specified character encoding (e.g., UTF-8).
  • -P pattern: Only display files matching the specified pattern (e.g., *.txt).
  • -I pattern: Exclude files and directories matching the specified pattern (e.g., *.bak).

Examples

  • Display the directory tree structure starting from the current directory:

    tree
    
  • Display the directory tree structure from a specific path:

    tree /path/to/start
    
  • Display only directories in the tree structure:

    tree -d
    
  • Display the tree structure and limit the depth to 2 levels:

    tree -L 2
    
  • Display the tree structure and output it to a file:

    tree -o output.txt
    
  • Display all files and directories, including hidden ones:

    tree -a
    

The tree command is a helpful tool for quickly understanding the layout of a directory and its contents. It is especially useful for navigating complex file systems and identifying the location of files and directories within a hierarchy.

which

The which command shows the full path of a command’s executable file by searching the directories listed in the PATH environment variable.

Basic Usage

  • Find the path of a command:

    which command
    

    This displays the full path of the specified command’s executable file.

Examples

  • Find the path of the ls command:

    which ls
    
  • Find the path of the python command:

    which python
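
Because which simply walks through the directories in PATH, it can help to look at that list directly. The following is a minimal sketch; the variable names are chosen here for illustration only.

```shell
# Print each directory in PATH on its own line, in search order.
# 'which' reports the first directory in this list that contains the command.
path_dirs=$(printf '%s\n' "$PATH" | tr ':' '\n')
printf '%s\n' "$path_dirs"

# Inside scripts, the POSIX builtin 'command -v' is a portable alternative to 'which'.
ls_path=$(command -v ls)
echo "ls resolves to: $ls_path"
```

If a command resolves to an unexpected location, the order of the directories printed above is usually the place to look.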
    

whereis

The whereis command locates not only the executable file but also the source and manual page files of a command, if available.

Basic Usage

  • Locate a command:

    whereis command
    

    This displays the paths to the executable, source, and manual page files of the specified command, if they exist.

Options

  • -b: Search only for binaries (executable files).
  • -m: Search only for manual pages.
  • -s: Search only for source files.
  • -u: Show only commands with unusual entries, that is, commands that do not have exactly one entry of each requested type.
  • -B path: Limit the search for binaries to the given directories.
  • -M path: Limit the search for manual pages to the given directories.
  • -S path: Limit the search for source files to the given directories.

Examples

  • Locate the ls command:

    whereis ls
    
  • Locate only the source files of the gcc command:

    whereis -s gcc
    

cat

Concatenates and displays file contents. It’s commonly used to view the contents of a file.

Example: Display the contents of a file named example.txt.

cat example.txt

grep

Searches for patterns in files. It’s a powerful tool for searching text using patterns.

Example: Search for the word “example” in file.txt.

grep "example" file.txt
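
A few options cover most everyday searches. The sketch below creates a throwaway file in a temporary directory (the file name and contents are made up for illustration) and shows line numbers, case-insensitive counting, and a recursive search.

```shell
workdir=$(mktemp -d)
printf 'First example line\nnothing here\nAnother Example\n' > "$workdir/file.txt"

# -n prefixes each matching line with its line number.
numbered=$(grep -n "example" "$workdir/file.txt")
echo "$numbered"

# -i ignores case; -c prints the number of matching lines instead of the lines.
count=$(grep -i -c "example" "$workdir/file.txt")
echo "$count"

# -r searches all files below a directory; -l prints only the matching file names.
matches=$(grep -r -l "example" "$workdir")
echo "$matches"

rm -r "$workdir"
```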

find

Searches for files in a directory hierarchy. This command is essential for locating files and directories.

Example: Find all .txt files in the current directory and its subdirectories.

find . -name "*.txt"
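
find descends into subdirectories by default, so it is often combined with options that narrow the search. The sketch below uses a temporary directory with made-up file names; note that -maxdepth is an extension offered by GNU and BSD find rather than part of POSIX.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/sub"
touch "$workdir/a.txt" "$workdir/b.log" "$workdir/sub/c.txt"

# -type f restricts matches to regular files; subdirectories are searched too.
all_txt=$(find "$workdir" -type f -name "*.txt" | sort)
echo "$all_txt"

# -maxdepth 1 stops find from descending into subdirectories.
top_txt=$(find "$workdir" -maxdepth 1 -type f -name "*.txt")
echo "$top_txt"

rm -r "$workdir"
```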

mkdir

Creates a new directory.

Example: Create a directory named new_directory.

mkdir new_directory
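
When several levels of directories are needed at once, the -p option can save a few steps. A minimal sketch, using a temporary directory and hypothetical directory names:

```shell
workdir=$(mktemp -d)

# -p creates any missing parent directories in one call.
mkdir -p "$workdir/project/results/run1"

# Running it again is not an error when the directory already exists.
mkdir -p "$workdir/project/results/run1"
status=$?

created="no"
[ -d "$workdir/project/results/run1" ] && created="yes"
echo "exit status: $status, created: $created"

rm -r "$workdir"
```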

rm

Removes files or directories. It’s a critical command for file management.

Example 1: Remove a file named example.txt.

rm example.txt

Example 2: Remove a directory and its contents (recursively).

rm -r directory_name

Warning: Be extremely cautious with rm -r, especially when used with . (current directory) or .. (parent directory), as this can lead to irreversible deletion of files. Never use rm -r . in a directory unless you are absolutely sure about deleting all its contents.
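
One way to reduce the risk is to list what would be deleted before actually deleting it. The sketch below works on a temporary directory with made-up names; it is one possible habit, not a required procedure.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/old_results"
touch "$workdir/old_results/run1.out" "$workdir/old_results/run2.out"

target="$workdir/old_results"

# Step 1: list exactly what would be deleted.
preview=$(find "$target" | sort)
echo "$preview"

# Step 2: delete only after checking the list. The quotes protect against
# spaces in the path, and ${target:?} aborts if the variable is empty.
rm -r -- "${target:?}"

[ -e "$target" ] || echo "removed"
rm -r "$workdir"
```

For interactive work, rm -ri asks for confirmation before each removal.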

cp

Copies files and directories.

Example: Copy file1.txt to file2.txt.

cp file1.txt file2.txt
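
Copying a directory requires the -r (recursive) option. A short sketch, with hypothetical directory and file names inside a temporary directory:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/project"
echo "data" > "$workdir/project/file1.txt"

# -r copies the directory and everything below it.
cp -r "$workdir/project" "$workdir/project_backup"

copied=$(cat "$workdir/project_backup/file1.txt")
echo "$copied"

rm -r "$workdir"
```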

mv

Moves or renames files and directories.

Example: Rename oldname.txt to newname.txt.

mv oldname.txt newname.txt

for, do, done

A for loop in Bash allows you to iterate over a list of items, such as an array, a set of files, or a range of numbers. The example below shows the basic form.

Iterating over a list of strings

In this example, the for loop iterates over a list of strings and prints each one:

# List of items
items=("apple" "banana" "cherry")

# Loop through each item
for item in "${items[@]}"; do
    echo "Item: $item"
done
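
The same construct also works for sets of files and ranges of numbers. The following sketch uses a temporary directory with made-up file names, and seq to generate the range (in Bash, the brace expansion {1..5} would produce the same list).

```shell
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"

# Loop over all files matching a glob pattern.
file_count=0
for file in "$workdir"/*.txt; do
    echo "Found: $(basename "$file")"
    file_count=$((file_count + 1))
done

# Loop over a range of numbers.
total=0
for i in $(seq 1 5); do
    total=$((total + i))
done
echo "Sum of 1..5: $total"

rm -r "$workdir"
```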

if, (else), then

The if statement in Bash scripting executes a block of code conditionally, depending on whether an expression evaluates to true or false. The example below checks whether a regular file exists at a given path:

filepath="/path/to/file.txt"

if [ -f "$filepath" ]; then
    echo "The file exists."
else
    echo "The file does not exist."
fi
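
More than two cases can be handled with elif, and several tests can be combined with && and ||. A minimal sketch, where the path and the threshold values are arbitrary choices for illustration:

```shell
path=$(mktemp -d)

# -f tests for a regular file, -d for a directory.
if [ -f "$path" ]; then
    kind="file"
elif [ -d "$path" ]; then
    kind="directory"
else
    kind="missing"
fi
echo "$path is: $kind"

# Two numeric tests combined with && (both must be true).
count=3
if [ "$count" -gt 0 ] && [ "$count" -lt 10 ]; then
    size="small"
else
    size="other"
fi
echo "$size"

rmdir "$path"
```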

alias

In Bash, an alias is a shortcut for a command. You can define an alias to simplify the execution of commonly used commands or to add default options to commands you frequently use. Here are some examples of how to create and use aliases in Bash:

Creating a simple alias

You can create an alias by using the alias command followed by the alias name and the command it represents. Here’s an example of a simple alias:

alias ll="ls -l"

Another commonly used alias is md as a shortcut for mkdir:

alias md="mkdir"

You can add these alias definitions to your ~/.bashrc file so that they are loaded each time you log in to the cluster.
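
Aliases can also be inspected and removed again. The sketch below reuses the ll alias from above; the exact text printed by alias differs slightly between shells.

```shell
alias ll="ls -l"

# Show the definition of a single alias.
definition=$(alias ll)
echo "$definition"

# Remove the alias, then confirm that it is gone.
unalias ll
if alias ll >/dev/null 2>&1; then
    state="defined"
else
    state="removed"
fi
echo "$state"
```

Running alias without arguments lists all aliases defined in the current shell.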

Slurm commands

SLURM (Simple Linux Utility for Resource Management) is an open-source job scheduler used on many of the world’s supercomputers and compute clusters. It allows users to request computing resources and queue their computational jobs for execution. Below are some essential SLURM commands with examples and brief explanations. Replace <jobid> with your specific job ID where necessary.

sinteractive

Requests an interactive session on a compute node, typically for testing. Compute resources such as memory, time, and GPUs are specified as part of the command, similar to sbatch directives.

Example: Request a 10-minute GPU node session.

sinteractive --time=00:10:00 --gres=gpu

sbatch

Used for submitting a script to SLURM for queuing in batch mode. The script includes directives at the top to specify required resources.

Example: Submit a job using a script named script.sh.

sbatch script.sh
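
As a rough sketch of what such a script.sh might contain: the directive values below (job name, time limit, CPU and memory sizes) are illustrative assumptions, and DAIC may require additional options, such as a partition or QoS, that are not shown here.

```shell
#!/bin/bash
# Name shown in the queue
#SBATCH --job-name=example
# Wall-clock time limit (HH:MM:SS)
#SBATCH --time=00:10:00
# Number of tasks and CPU cores per task
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
# Memory per node
#SBATCH --mem=1G
# Output file; %j is replaced by the job ID
#SBATCH --output=slurm-%j.out

# SLURM sets SLURM_JOB_ID inside a job; use a placeholder when run elsewhere.
job_id="${SLURM_JOB_ID:-not-in-slurm}"
echo "Job $job_id running on $(hostname)"
```

The #SBATCH lines are ordinary comments to the shell, so they must appear before the first command in the script for sbatch to read them.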

squeue

Checks the status of jobs in the SLURM queue. Useful for tracking your jobs, understanding the state of the queue, and finding the job ID of a particular job.

Example: Check the status of all your queued jobs.

squeue -u $USER

scancel

Cancels a job or all jobs of a user. Vital for managing jobs that are no longer needed or were submitted in error.

Example 1: Cancel a specific job with job ID <jobid>.

scancel <jobid>

Example 2: Cancel all jobs for the current user.

scancel -u $USER

slurmtop

A DAIC-specific command to view the top jobs in the queues and their resource usage.

Example:

slurmtop

scontrol

Shows detailed information and resources allocated to the job with the specified SLURM job ID.

Example: Show details of the job with job ID <jobid>.

scontrol show job <jobid>

sinfo

Displays information about SLURM nodes and partitions. Key command for understanding the state of the cluster.

Example: Display information about all nodes and partitions.

sinfo

sacct

Displays accounting data for jobs and job steps. Useful for tracking resource usage and performance metrics.

Example: Display selected accounting fields for your jobs (by default, sacct reports only jobs from the current day).

sacct --format=JobID,JobName%30,State,Elapsed,Timelimit,AllocNodes,Priority,Start,NodeList

Other

module

In the context of Unix-like operating systems, the module command is part of the environment modules system, a tool that provides a dynamic approach to managing the user environment. This system allows users to load and unload different software packages or environments on demand.

Basic Usage

  • Load a module:

    module load module-name
    

    This command loads the specified module, setting up the environment variables and paths needed for the software package.

  • Unload a module:

    module unload module-name
    

    This command unloads the specified module, removing any environment variables and paths associated with it.

For a more detailed description of module see Modules.