# Using GPU nodes

## Overview
An increasing number of scientific software packages relevant to earth system research can take advantage of GPGPU (General-Purpose computing on Graphics Processing Units) technology. Therefore, a considerable part of the Levante cluster consists of nodes equipped with GPUs. These nodes make up the Slurm partition named `gpu`. They are intended for the following use cases:

- Production runs with GPU-ready model codes
- Porting and development of model codes not yet adapted to GPUs
- Use of ML methods for data processing and analysis
- 3D visualization benefiting from hardware acceleration
All Levante GPU nodes are equipped with 4 GPUs of type NVIDIA A100-SXM4 (Ampere architecture). Four nodes from Levante phase one have 40 GB of memory per GPU; the rest have 80 GB. The table below gives an overview of the node configuration.
| GPUs | CPU arrangement | #Nodes | GPU type tag/feature |
|---|---|---|---|
| 4x NVIDIA A100, 40 GB per GPU | 2x AMD 7763 CPU; 128 cores in total, 512 GB/1024 GB main memory | 2/2 | |
| 4x NVIDIA A100, 80 GB per GPU | 2x AMD 7763 CPU; 128 cores in total, 512 GB main memory | 56 | |
To query further details of the GPU hardware, you can allocate a GPU node with:

```bash
$ salloc -p gpu --gpus=4 -A <prj-account>
```

and execute the following commands:

```bash
$ nvidia-smi -q
$ module load nvhpc/22.5-gcc-11.2.0
$ nvaccelinfo
```
In addition to the nodes in the `gpu` partition, there is a small number of nodes equipped with 2 NVIDIA A100 GPUs each, but without the NVLink interconnect between the GPUs. These nodes can be allocated via the `gpu-devel` partition and are meant for testing and debugging.
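For example, a short interactive test session on such a node might be requested like this (a minimal sketch; `<prj-account>` is a placeholder, and it is assumed that `gpu-devel` accepts the same GPU options as `gpu`):

```bash
$ salloc --partition=gpu-devel --gpus=1 -A <prj-account>
```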
## The GPU partition
Levante's GPU nodes can be accessed via the Slurm partition `gpu`:

```bash
#SBATCH --partition=gpu
```
Note

Only projects with an allocation for the GPU resource can submit jobs to the `gpu` partition. Read more about requesting the GPU resource.
Individual GPUs of a node can be requested separately; therefore, you must explicitly specify the number of GPUs needed, using the `--gpus` option:

```bash
#SBATCH --gpus=1
```
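Putting these options together, a minimal single-GPU batch script could look like the following sketch (the account name, time limit, and the program `./my_gpu_program` are placeholders):

```bash
#!/bin/bash
#SBATCH --job-name=gpu-test        # arbitrary job name
#SBATCH --partition=gpu            # GPU partition
#SBATCH --gpus=1                   # request one A100
#SBATCH --account=<prj-account>    # placeholder: your project account
#SBATCH --time=00:30:00            # placeholder: requested wall-clock time

# Placeholder executable; replace with your GPU-enabled program
srun ./my_gpu_program
```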
There are various ways to specify the GPU resources for your job, which are documented in the Slurm documentation. We recommend using one of the following:
| Slurm option | Comment |
|---|---|
| `--gpus=[type:]<count>` | Number of GPUs per job |
| `--gpus-per-node=[type:]<count>` | Number of GPUs per node |
| `--gpus-per-task=[type:]<count>` | Number of GPUs per task; the number of tasks must be known |
Here, `count` and `type` have to be replaced accordingly. The optional argument `type` can be any of the GPU type tags listed above. However, we have observed unexpected behavior when using this option and therefore recommend omitting it. If the exact GPU type matters for your job, use `--constraint=` instead and specify the nodes' feature. The GPU nodes' features are listed in the table at the beginning of this page. When no GPU type is specified, Slurm will assign whichever suitable hardware is available first to your job.
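For instance, to pin a job to nodes with a specific GPU memory size, you could combine the options like this (a sketch; `<feature>` stands for one of the node features from the table above and is not filled in here):

```bash
#SBATCH --partition=gpu
#SBATCH --gpus=4
#SBATCH --constraint=<feature>   # e.g. the feature tag of the 80 GB nodes
```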
To start a job with two tasks and one GPU per task, you could, for example, define it as follows:

```bash
#SBATCH --ntasks=2
#SBATCH --gpus-per-task=1
```
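With this setup, Slurm typically restricts each task to its own device via `CUDA_VISIBLE_DEVICES`. A quick way to check the binding is a sketch like the following, added to the job script (the exact output depends on the site configuration):

```bash
# Print which GPU(s) each task can see; with --gpus-per-task=1 every task
# should report a single device.
srun bash -c 'echo "task $SLURM_PROCID: CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'
```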
If all 4 GPUs of a node are to be used, the most flexible way to make all devices visible is to use the `--exclusive` option:

```bash
#SBATCH --exclusive
```
With this method, more complicated setups (e.g. 5 MPI tasks of which only 4 use GPUs) can be handled.
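As an illustration, the following sketch allocates a full node exclusively and lists the devices visible to a single task (project account and time limit are placeholders):

```bash
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --exclusive              # whole node, all 4 GPUs visible
#SBATCH --account=<prj-account>  # placeholder: your project account
#SBATCH --time=00:10:00          # placeholder: requested wall-clock time

# All four A100 devices of the node should be listed
srun --ntasks=1 nvidia-smi -L
```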
Basic example Slurm scripts using GPUs can be found in Example Batch Scripts.