Levante is not yet fully available. Please check this page regularly for current limitations.
When running programs on Levante, various settings might be needed to achieve satisfying performance or, in some cases, even to allow a program to run at all. This section describes the environment settings that most often need to be set in Slurm batch scripts.
MPI Runtime Settings¶
Modern MPI library implementations provide a large number of user-configurable parameters and algorithms for performance tuning. Although the local configuration of MPI libraries is initially performed by the vendor to match the characteristics of the cluster, the performance of a specific application can often be further improved, by up to 15%, through an optimal choice of tunable parameters.
Since tuning options are specific to an MPI library and application, the recommendations for MPI runtime settings below are just a starting point for each library.
Open MPI 4.0.0 and later¶
As a minimal environment setting we recommend the following to make use of the UCX toolkit. This is just a starting point; users will have to tune the environment depending on the application used.
export OMPI_MCA_pml="ucx"
export OMPI_MCA_btl=self
export OMPI_MCA_osc="pt2pt"
export UCX_IB_ADDR_TYPE=ib_global
# for most runs one may or may not want to disable HCOLL
export OMPI_MCA_coll="^ml,hcoll"
export OMPI_MCA_coll_hcoll_enable="0"
export HCOLL_ENABLE_MCAST_ALL="0"
export HCOLL_MAIN_IB=mlx5_0:1
export UCX_NET_DEVICES=mlx5_0:1
export UCX_TLS=mm,knem,cma,dc_mlx5,dc_x,self
export UCX_UNIFIED_MODE=y
export HDF5_USE_FILE_LOCKING=FALSE
export OMPI_MCA_io="romio321"
export UCX_HANDLE_ERRORS=bt
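In a Slurm batch script these exports belong after the #SBATCH header and before the srun call. A minimal sketch follows; the job name, node counts, and executable name are placeholders for illustration, not actual Levante settings:

```shell
#!/bin/bash
#SBATCH --job-name=ompi_example   # hypothetical job name
#SBATCH --nodes=2                 # adjust to your job
#SBATCH --time=00:30:00

# Open MPI / UCX runtime settings (subset of the recommendations above)
export OMPI_MCA_pml="ucx"
export OMPI_MCA_osc="pt2pt"
export UCX_IB_ADDR_TYPE=ib_global

srun ./my_program                 # placeholder executable
```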
The ompi_info tool can be used to get detailed information about the Open MPI installation and local configuration.
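For example, assuming an Open MPI installation is on your PATH, you can list the full configuration or query a single MCA framework:

```shell
# List all configuration parameters of the Open MPI installation
ompi_info --all

# Show the parameters of a single MCA framework, e.g. pml
ompi_info --param pml all
```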
Intel MPI¶
Environment variables for Intel MPI start with the I_MPI_ prefix. The complete reference of environment variables can be found at Intel’s site.
On Levante, to run programs built with Intel MPI, you should set at least
the following environment variables:
export I_MPI_PMI=pmi
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
For large jobs we recommend using PMI-2 instead of PMI. The corresponding settings are:
export I_MPI_PMI=pmi2
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
srun --mpi=pmi2 ...
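Put together in a batch script, a large Intel MPI job might be launched as sketched below; the node count and executable name are placeholders, not actual recommendations:

```shell
#!/bin/bash
#SBATCH --nodes=64               # adjust to your job
#SBATCH --time=01:00:00

# PMI-2 settings recommended for large Intel MPI jobs (see above)
export I_MPI_PMI=pmi2
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so

srun --mpi=pmi2 ./my_program     # placeholder executable
```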
Stack Size¶
An unlimited stack size might have a negative influence on performance. Also, an unlimited stack can hide invalid memory accesses. It is therefore recommended to set the amount actually needed. For example, to set the limit for the stack size to 200 MB (200*1024 KB), use one of the following statements:
ulimit -s 204800         # bash
limit stacksize 204800   # tcsh
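The figure 204800 follows from the fact that ulimit -s (and limit stacksize) expect the limit in kibibytes:

```shell
# ulimit -s takes the stack limit in KiB, so 200 MB is:
echo $((200 * 1024))    # prints 204800
```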
It might be necessary to increase the stack size further if your program uses large automatic arrays. If the stack size is too small, the program will usually crash with an error message like this:
"Caught signal 11 (Segmentation fault: address not mapped to object at address 0x0123456789abcdef)".
Obviously, the actual address will vary. If increasing the stack size does not resolve the program abort, a segmentation fault error is a strong indication of a bug in your program.
Core File Size¶
It is also recommended to disable core file generation unless needed for debugging purposes:
ulimit -c 0    # bash
limit core 0   # tcsh
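To verify the setting, the limit can be queried again. Running both commands in a subshell, as sketched below, keeps the change local to that command:

```shell
# Disable core dumps and confirm the new limit; the subshell leaves
# the calling shell's limits untouched
bash -c 'ulimit -c 0; ulimit -c'    # prints 0
```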
All current limits can be listed with the following command:
ulimit -a   # bash
limit       # tcsh