Warning

Levante is not yet fully available. Please check regularly which limitations currently apply.

Partitions and Limits

In SLURM, multiple nodes can be grouped into partitions, which are sets of nodes with associated limits for wall-clock time, job size, etc. These are hard limits for jobs and can only be overridden by a QOS (quality of service). Partitions can overlap, i.e. one node may belong to several partitions.

Jobs are allocations of resources by users in order to execute tasks on the cluster for a specified period of time. Furthermore, the concept of job steps is used by SLURM to describe a set of different tasks within the job. One can imagine job steps as smaller allocations or jobs within the job, which can be executed sequentially or in parallel during the main job allocation.
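The relation between a job and its job steps can be sketched with a small batch script. The snippet below only writes such a script to a file named job_steps.sh (a hypothetical name); the partition, node count, time limit, and the use of hostname as the payload are illustrative assumptions, and your site may require further options.

```shell
# Write an illustrative job script to the current directory.
# Partition, node count, and time limit are example values only.
cat > job_steps.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=steps_demo
#SBATCH --partition=compute
#SBATCH --nodes=2
#SBATCH --time=00:10:00

# Two job steps started in the background so that they run in parallel.
srun --nodes=1 --ntasks=1 hostname &
srun --nodes=1 --ntasks=1 hostname &
wait   # block until both parallel steps have finished

# A third job step that runs afterwards and spans both nodes.
srun --nodes=2 --ntasks=2 hostname
EOF
```

Each srun call inside the script starts one job step within the allocation requested by the #SBATCH lines. The script would be submitted with sbatch job_steps.sh.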

The SLURM sinfo command lists all partitions and nodes managed by SLURM on Levante and provides general information about the current status of the nodes:

$ sinfo

PARTITION   AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute        up    8:00:00   2200  alloc l[10000-10032,...,40322-40341]
compute        up    8:00:00     32   idle l[40167-40174,40178-40183,40190-40195,40348-40359]
shared         up 7-00:00:00      4  alloc l[10060,10063,10066,10069]
gpu          down   12:00:00      4   idle l[40360,40363,40366,40369]
visualize    down   12:00:00      4  drain lg[0-3]
interactive  down   12:00:00      1   resv l10169
interactive  down   12:00:00      3  alloc l[10160,10163,10166]
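To get a quick total per partition and state, the listing can be condensed with a few lines of awk. This is a minimal sketch that assumes the default six-column sinfo layout shown above; summarize_sinfo is a hypothetical helper name, not a SLURM command.

```shell
# Sum the NODES column per partition and state.
# Assumes the default layout: PARTITION AVAIL TIMELIMIT NODES STATE NODELIST.
summarize_sinfo() {
    awk '$1 != "PARTITION" && $4 ~ /^[0-9]+$/ { nodes[$1 " " $5] += $4 }
         END { for (key in nodes) print key, nodes[key] }' | LC_ALL=C sort
}

# Usage on a system with SLURM:
#   sinfo | summarize_sinfo
```

Each output line has the form "partition state node-count", so rows that sinfo splits by node list are merged into one total.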

For detailed information about all available partitions and their limits use the SLURM scontrol command as follows:

$ scontrol show partition
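The scontrol output consists of Key=Value pairs, so a single limit can be pulled out with standard text tools. The helper below is a sketch: partition_field is a hypothetical name, and field names such as MaxTime or MaxNodes are assumed to match what scontrol prints on your system, so check them against the actual output.

```shell
# Print the value of one Key=Value field read from stdin.
# $1 is the field name, e.g. MaxTime.
partition_field() {
    tr -s ' \t' '\n\n' |
        awk -v key="$1" 'index($0, key "=") == 1 { print substr($0, length(key) + 2); exit }'
}

# Usage on a system with SLURM:
#   scontrol show partition compute | partition_field MaxTime
```

Only the first matching field is printed, so it is best to pass the output for a single partition.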

The following partitions are currently defined on Levante:

compute

This partition consists of 2659 AMD EPYC 7763 Milan compute nodes and is intended for running parallel scientific applications. The compute nodes allocated for a job are used exclusively and cannot be shared with other jobs.

shared

This partition is defined on 10 nodes and can be used to run small jobs that do not require a whole node, so that one compute node can be shared between different jobs. The partition is dedicated to shared-memory applications parallelized with OpenMP or pthreads, as well as to serial and parallel data processing jobs that need a considerably longer allocation period than usual compute jobs.

interactive

The interactive partition is made up of 10 nodes but can be dynamically expanded if there is a short-term need. It is intended for memory- or compute-intensive data processing and compilation tasks that should not run on the login nodes. Nodes of this partition can be shared with other jobs if a single job does not allocate all resources. Use salloc to allocate the resources and jump directly to that node. As a rule, jobs in this partition should start without waiting time. The total amount of resources per user in this partition is limited to the equivalent of 2 nodes.

gpu

The 60 nodes in this partition are each equipped with 2 AMD EPYC 7713 Milan CPUs and 4 additional Nvidia A100 GPUs. They can be used for GPGPU-aware scientific applications (e.g. via OpenACC programming) or for interactive 3-dimensional data visualization via VirtualGL/TurboVNC.

visualize

There are 4 dedicated nodes for interactive 3-dimensional data visualization. Access is granted only on request.

Warning

The following limits might be changed in the first months of Levante's general availability.

The SLURM limits configured for different partitions are:

Partition Name  Max Nodes per Job  Max Job Runtime  Max resources used simultaneously  Shared Node Usage  Default Memory per CPU  Max Memory per CPU
compute         512                8 hours          no limit                           no                 1920 MB                 8000 MB
shared          1                  7 days           no limit                           yes                1920 MB                 1920 MB
interactive     1                  12 hours         256 CPUs                           yes                1920 MB                 1920 MB
gpu             60                 12 hours         no limit                           yes                1920 MB                 4000 MB
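The memory-per-CPU limits imply a minimum number of CPUs for a given total memory request. The calculation below is a back-of-the-envelope sketch that assumes memory is requested per CPU and capped at the partition's maximum. cpus_for_memory is a hypothetical helper, and how SLURM treats requests above the limit depends on the site configuration.

```shell
# Smallest number of CPUs whose combined per-CPU memory covers the request.
# $1 = required memory in MB, $2 = max memory per CPU in MB.
cpus_for_memory() {
    mem_mb=$1
    max_mb_per_cpu=$2
    # Integer ceiling division: ceil(mem_mb / max_mb_per_cpu)
    echo $(( (mem_mb + max_mb_per_cpu - 1) / max_mb_per_cpu ))
}

cpus_for_memory 10000 1920   # shared partition limit: prints 6
cpus_for_memory 10000 4000   # gpu partition limit: prints 3
```

For example, 5 CPUs at 1920 MB each provide only 9600 MB, so a 10000 MB request in the shared partition needs at least 6 CPUs.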

Hint

If your jobs require either longer execution times or more nodes, contact the DKRZ Help Desk. The predefined limits can be adjusted for a limited time to match your purposes by specifying an appropriate Quality of Service (QOS). Please include the following information in your request: the reason why you need higher limits, which limits to increase, and for how long they should be increased. A brief justification by your project admin is also needed.

CAUTION: All jobs on Levante have to be assigned to a partition; there is no default partition. The partition can be chosen in several ways:

  • Environment variable

export SBATCH_PARTITION=<partitionname>

  • Batch script option

#SBATCH [-p|--partition=]<partitionname>

  • Command line option

sbatch [-p|--partition=]<partitionname>

Note that the environment variable overrides a matching option set in the batch script, and a command line option overrides the matching environment variable.
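The precedence order can be modelled with a small function. This only illustrates the ordering described above and is not how sbatch is implemented; effective_partition is a hypothetical helper, and the partition names are example values.

```shell
# Model of the precedence: command line > environment variable > batch script.
# $1 = command line value, $2 = SBATCH_PARTITION value, $3 = #SBATCH value.
# Pass an empty string for a source that is not set.
effective_partition() {
    if [ -n "$1" ]; then
        echo "$1"
    elif [ -n "$2" ]; then
        echo "$2"
    elif [ -n "$3" ]; then
        echo "$3"
    else
        echo "error: no partition specified" >&2
        return 1
    fi
}

effective_partition ""    "compute" "shared"   # prints: compute
effective_partition "gpu" "compute" "shared"   # prints: gpu
```

The error branch mirrors the fact that there is no default partition to fall back on.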

To control the job workload on the Levante cluster and keep SLURM responsive, we enforce the following restrictions on the number of jobs:

SLURM Limits          Max Number of Submitted Jobs  Max Number of Running Jobs
Per User and Account  1000                          20

If needed, you can ask for higher limits by sending a request with a short justification to support@dkrz.de. Based on the technical limitations and a fair share between all users, we might then arrange a QOS for some limited time.
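To see how close you are to these limits, the job states reported by squeue can be counted. The sketch below assumes one state per line on stdin, as produced by squeue with the %T format. count_jobs is a hypothetical helper, and job arrays or jobs under several accounts may be counted differently by SLURM itself.

```shell
# Count all listed jobs and the running ones from a list of job states.
# Expects one state per line (e.g. RUNNING, PENDING) on stdin.
count_jobs() {
    awk 'NF { total++ }
         $1 == "RUNNING" { running++ }
         END { printf "submitted=%d running=%d\n", total, running }'
}

# Usage on a system with SLURM:
#   squeue -u "$USER" -h -o %T | count_jobs
```

The two numbers can then be compared with the submitted and running job limits in the table above.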

To list the job limits and qualities of service (QOS) relevant to you, use the sacctmgr command, for example:

$ sacctmgr -s show user $USER

$ sacctmgr -s show user $USER format=user,account,maxjobs,maxsubmit,qos