Data Processing on Levante
A part of the Levante cluster is reserved for interactive data processing and analysis. These hardware resources form the Slurm partition named interactive. They cannot be accessed directly via ssh, but have to be reserved using the Slurm salloc command from one of the login nodes (levante.dkrz.de). This approach prevents programs of individual users from impacting the sessions of other users, as was often the case on the mistralpp nodes of the HLRE-3 system.
How to start an interactive session
Starting an interactive session is accomplished with a single command that reserves resources and jumps directly to the assigned node:
$ salloc --x11 -p interactive -A xz0123 -n 1 -t 240
The option --x11 in the above command sets up the X11 forwarding needed to use GUI applications. Please take care to adapt settings like the project account (-A), the number of tasks (-n), and the wall-clock time limit (-t) to your actual needs.
To receive more memory for your session, you can either increase the number of tasks:
$ salloc --x11 -p interactive -A xz0123 -n 10 -t 240
Or, you can use the --mem or --mem-per-cpu options of the salloc command, for example:
$ salloc --x11 -p interactive -A xz0123 --mem=5760 -t 240
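The --mem-per-cpu option expresses the memory request per task instead of per node; a corresponding sketch with purely illustrative values:
$ salloc --x11 -p interactive -A xz0123 -n 4 --mem-per-cpu=1440 -t 240
With four tasks, this amounts to the same 5760 MB in total as the --mem example above.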
For further details please refer to the salloc(1) manual page:
$ man salloc
The name of the allocated node is set by Slurm in the environment variable SLURM_JOB_NODELIST:
$ echo $SLURM_JOB_NODELIST
l10160
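If you need the node name in a shell where this variable is not set, squeue lists your running jobs together with their nodes; the output format string below is just one possible choice:
$ squeue -u $USER -o "%.10i %.12P %.8T %N"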
As long as your job allocation is active, you can open further interactive sessions on the allocated node using the ssh command from a login node:
$ ssh -X l10160
An interactive session can be terminated by invoking the exit command from the shell spawned by salloc.
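Alternatively, the whole allocation can be revoked from any login node with the scancel command; the job ID below is a placeholder:
$ scancel <jobid>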
Submitting a batch job
Interactive sessions started with the salloc command should only be used if your data processing steps really require interactive control on your part. An interactive session is terminated abruptly if any of the following fails:
the issuing host (levante.dkrz.de)
the network connection between the issuing host and your machine
your machine (e.g. by going to sleep)
Therefore, it is better to script the work that should be done and submit the script for execution with the sbatch command. The example script below can be used as a basis for the execution of serial applications in batch mode:
#!/bin/bash
#SBATCH -J myjob # Specify job name
#SBATCH -p shared # Use partition shared
#SBATCH -N 1 # Specify number of nodes (1 for serial applications!)
#SBATCH -n 1 # Specify max. number of tasks to be invoked
#SBATCH -t 01:00:00 # Set a limit on the total run time
#SBATCH -A xz0123 # Charge resources on this project account
#SBATCH -o myjob.o%j # File name for standard and error output
set -e               # Abort the script if any command fails
module load python3
echo "Start python script execution at $(date)"
python -u /path/to/myscript.py
Please insert your own job name (-J), project account (-A), file name for standard output and error output (-o), number of tasks (-n), and the commands to be executed. The --mem or --mem-per-cpu options can be used to request a higher amount of memory for your job if needed.
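Assuming the script shown above has been saved as myjob.sh (the file name is arbitrary), it is submitted from a login node with sbatch, and the state of the job can then be checked with squeue:
$ sbatch myjob.sh
$ squeue -u $USER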