Example Batch Scripts#
Two-way simultaneous multithreading (SMT) is enabled on all Levante nodes, i.e. the operating system recognizes 256 logical CPUs per node, while there are only 128 physical cores. In most cases, it is advisable not to use the simultaneous threads for the application, but to leave them to the operating system.
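A quick way to confirm this layout on a node is to query the CPU topology; the following command is a generic Linux check (not a Levante-specific tool) and should report 256 logical CPUs and 2 threads per core:

lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\))'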
Below, example batch scripts are provided for the following use cases:

- MPI job without simultaneous multithreading
- Hybrid (MPI/OpenMP) job without simultaneous multithreading
- Serial job
MPI job without simultaneous multithreading#
The overall structure of the batch script does not depend on whether you use IntelMPI or OpenMPI (or any other MPI implementation). Specific environment variables should be set in order to fine-tune the chosen MPI. In particular, the parallel application should always be started using the srun command instead of invoking mpirun, mpiexec or others.
In the following example, 12*128 = 1536 cores are used to execute a parallel program.
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --partition=compute
#SBATCH --nodes=12
#SBATCH --ntasks-per-node=128
#SBATCH --exclusive
#SBATCH --time=00:30:00
#SBATCH --mail-type=FAIL
#SBATCH --account=xz0123
#SBATCH --output=my_job.%j.out
# limit stacksize ... adjust to your program's needs
# and core file size
ulimit -s 204800
ulimit -c 0
# Replace this block according to https://docs.dkrz.de/doc/levante/running-jobs/runtime-settings.html#mpi-runtime-settings
echo "Replace this block according to https://docs.dkrz.de/doc/levante/running-jobs/runtime-settings.html#mpi-runtime-settings"
exit 23
# End of block to replace
# Use srun (not mpirun or mpiexec) command to launch
# programs compiled with any MPI library
srun -l --cpu_bind=verbose --hint=nomultithread \
--distribution=block:cyclic ./myprog
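Once saved, the script is submitted and monitored with the standard Slurm commands, for example (my_job.sh is just an illustrative file name):

sbatch my_job.sh        # submit the job script
squeue -u $USER         # list your pending and running jobs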
Note: --hint=nomultithread cannot be used in conjunction with --ntasks-per-core, --threads-per-core and --cpu-bind (--cpu-bind=verbose is allowed, though).
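If your use case requires one of the conflicting options, a common alternative is to drop --hint=nomultithread and bind each task to a physical core explicitly. The following is a sketch based on generic srun options, not a Levante-specific recommendation:

# Bind each MPI task to one core; the core's two hardware threads
# remain available to that task only
srun -l --cpu_bind=verbose,cores \
    --distribution=block:cyclic ./myprog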
Please also read the section compiling and linking MPI programs on Levante.
Hybrid (MPI/OpenMP) job without simultaneous multithreading#
The following job example will allocate 4 nodes from the compute partition for 1 hour. The job will launch 32 MPI ranks per node and 4 OpenMP threads per rank, so that all 128 physical cores of each node are used (32 ranks × 4 threads = 128 cores).
#!/bin/bash
#SBATCH --job-name=my_job # Specify job name
#SBATCH --partition=compute # Specify partition name
#SBATCH --nodes=4 # Specify number of nodes
#SBATCH --ntasks-per-node=32 # Specify number of (MPI) tasks on each node
#SBATCH --time=01:00:00 # Set a limit on the total run time
#SBATCH --mail-type=FAIL # Notify user by email in case of job failure
#SBATCH --account=xz0123 # Charge resources on this project account
#SBATCH --output=my_job.o%j # File name for standard output
# Bind your OpenMP threads
export OMP_NUM_THREADS=4
export KMP_AFFINITY="verbose,granularity=fine,scatter"
export KMP_LIBRARY="turnaround"
# limit stacksize ... adjust to your program's needs
# and core file size
ulimit -s 204800
ulimit -c 0
export OMP_STACKSIZE=128M
# Replace this block according to https://docs.dkrz.de/doc/levante/running-jobs/runtime-settings.html#mpi-runtime-settings
echo "Replace this block according to https://docs.dkrz.de/doc/levante/running-jobs/runtime-settings.html#mpi-runtime-settings"
exit 23
# End of block to replace
# Use srun (not mpirun or mpiexec) command to launch
# programs compiled with any MPI library
srun -l --cpu_bind=verbose --hint=nomultithread \
--distribution=block:cyclic:block ./myprog
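To check that the threads really spread over four physical cores per rank, you can let the OpenMP runtime report its placement. The lines below are a hedged sketch using the standard OpenMP 5.0 environment variable and a generic Slurm option, not a Levante-specific prescription:

# Print the binding of every OpenMP thread at startup (OpenMP 5.0)
export OMP_DISPLAY_AFFINITY=true
# If threads pile up on a single core per rank, explicitly reserving the CPUs
# usually helps: add "#SBATCH --cpus-per-task=4" and/or pass
# "--cpus-per-task=4" to srun (matching OMP_NUM_THREADS)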
Serial job#
#!/bin/bash
#SBATCH --job-name=my_job # Specify job name
#SBATCH --partition=shared # Specify partition name
#SBATCH --mem=10G # Specify amount of memory needed
#SBATCH --time=00:30:00 # Set a limit on the total run time
#SBATCH --mail-type=FAIL # Notify user by email in case of job failure
#SBATCH --account=xz0123 # Charge resources on this project account
#SBATCH --output=my_job.o%j # File name for standard output
set -e
ulimit -s 204800
module load python3
# Execute serial programs, e.g.
python -u /path/to/myscript.py
The shared partition has a limit of 960 MB of memory per CPU. If your serial job needs more memory, you have to increase the requested amount (--mem) accordingly; Slurm will then automatically increase the number of CPUs allocated to the job.
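As a worked example: with the 960 MB per CPU limit, a request of --mem=10G (10240 MB) implies ceil(10240 / 960) = 11 CPUs for the job. The allocation of a running job can be inspected with standard Slurm commands, e.g. (<jobid> is a placeholder for your job ID):

squeue -j <jobid> -o "%C %m"               # allocated CPUs and requested memory
scontrol show job <jobid> | grep NumCPUs   # detailed job record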