Data Processing on Levante#

A part of the Levante cluster is reserved for interactive data processing and analysis. These hardware resources form the Slurm partition named interactive. The nodes of this partition cannot be accessed directly via ssh; instead, resources have to be reserved using the Slurm salloc command from one of the login nodes (levante.dkrz.de). This approach prevents programs of individual users from impacting the sessions of other users, as was often the case on the mistralpp nodes of the HLRE-3 system.

How to start an interactive session#

Starting an interactive session is accomplished with a single command that reserves the resources and jumps directly to the assigned node:

$ salloc --x11 -p interactive -A xz0123 -n 1 -t 240

The option --x11 in the above command sets up X11 forwarding, which is needed to use GUI applications. Please take care to adapt settings like the project account (-A), the number of tasks (-n), and the wall-clock time limit (-t) to your actual needs.
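If you want to check that X11 forwarding is working before launching a larger GUI application, you can start a simple X client such as xclock (assuming it is installed on the node):

$ xclock

A small clock window should appear on your local display.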

To obtain more memory for your session, you can either increase the number of tasks, since memory is allocated in proportion to the number of requested CPUs:

$ salloc --x11 -p interactive -A xz0123 -n 10 -t 240

Or, you can use the --mem or --mem-per-cpu options of the salloc command, for example:

$ salloc --x11 -p interactive -A xz0123 --mem=5760 -t 240
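For instance, the following variant requests memory per allocated CPU instead of a fixed total; Slurm interprets both values as megabytes by default, and the task count of 4 is only an illustrative choice:

$ salloc --x11 -p interactive -A xz0123 -n 4 --mem-per-cpu=1440 -t 240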

For further details please refer to the salloc(1) manual page:

$ man salloc

The name of the allocated node is set by Slurm in the environment variable SLURM_JOB_NODELIST:

$ echo $SLURM_JOB_NODELIST
l10160
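From a login node you can also list your active allocations, including the assigned nodes, with squeue:

$ squeue -u $USER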

As long as your job allocation is active, you can open further interactive sessions on the allocated node using the ssh command from a login node:

$ ssh -X l10160

An interactive session can be terminated by invoking the exit command from the shell spawned by salloc.
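Alternatively, for example if the session has become unresponsive, you can cancel the allocation from a login node with scancel, passing the job ID reported by squeue:

$ scancel <jobid>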

Submitting a batch job#

Interactive sessions started with the salloc command should only be used if your data processing steps really require interactive control on your part. Keep in mind that an interactive session is abruptly terminated if any of the following fails:

  • the issuing host (levante.dkrz.de)

  • the network connection between the issuing host and your machine

  • your machine (e.g. by going to sleep)

Therefore, it is better to script the work to be done and submit the script for execution with the sbatch command. The example script below can be used as a basis for executing serial applications in batch mode:

#!/bin/bash
#SBATCH -J myjob           # Specify job name
#SBATCH -p shared          # Use partition shared
#SBATCH -N 1               # Specify number of nodes (1 for serial applications!)
#SBATCH -n 1               # Specify max. number of tasks to be invoked
#SBATCH -t 01:00:00        # Set a limit on the total run time
#SBATCH -A xz0123          # Charge resources on this project account
#SBATCH -o myjob.o%j       # File name for standard and error output

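# Abort the script as soon as any command fails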
set -e

module load python3

echo "Start python script execution at $(date)"

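# Run Python with unbuffered output (-u) so that messages appear in the job log immediately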
python -u /path/to/myscript.py

Please insert your own job name (-J), project account (-A), file name for standard output and error output (-o), number of tasks (-n), and the commands to be executed. The --mem or --mem-per-cpu options can be used to request a larger amount of memory for your job if needed.
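Assuming the script has been saved as myjob.sh (the file name is just an example), it can be submitted for execution and its status monitored as follows:

$ sbatch myjob.sh
$ squeue -u $USER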