Warning

Levante is not yet fully available. Please check regularly which limitations to expect.

MPI Runtime Settings

Modern MPI library implementations provide a large number of user-configurable parameters and algorithms for performance tuning. Although the local configuration of the MPI libraries is initially performed by the vendor to match the characteristics of the cluster, the performance of a specific application can often be improved further, by up to 15%, through an optimal choice of tunable parameters.

Since tuning options are specific to an MPI library and application, the recommendations for MPI runtime settings below are just a starting point for each version.

OpenMPI based MPI libraries

OpenMPI 4.0.0 and later

As a minimal environment setting, we recommend the following to make use of the UCX toolkit. This is just a starting point; users will have to tune the environment depending on the application used.

export OMPI_MCA_pml="ucx"
export OMPI_MCA_btl=self
export OMPI_MCA_osc="pt2pt"
export UCX_IB_ADDR_TYPE=ib_global
# HCOLL collectives are disabled below; whether this helps depends on the application
export OMPI_MCA_coll="^ml,hcoll"
export OMPI_MCA_coll_hcoll_enable="0"
export HCOLL_ENABLE_MCAST_ALL="0"
export HCOLL_MAIN_IB=mlx5_0:1
export UCX_NET_DEVICES=mlx5_0:1
export UCX_TLS=mm,knem,cma,dc_mlx5,dc_x,self
export UCX_UNIFIED_MODE=y
export HDF5_USE_FILE_LOCKING=FALSE
export OMPI_MCA_io="romio321"
export UCX_HANDLE_ERRORS=bt
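
Before launching a job, it can help to confirm which of these variables are actually exported in the current shell. The following is a minimal sketch, not part of the recommended settings: the helper name show_mpi_env is hypothetical, and the two exports merely repeat values from the list above so that the snippet runs on its own.

```shell
#!/bin/bash
# Hypothetical helper: list the MPI-related tuning variables that are
# currently exported, restricted to the prefixes used on this page.
show_mpi_env() {
    env | grep -E '^(OMPI_MCA_|UCX_|HCOLL_)' | sort
}

# Two settings copied from the list above, so the helper has something to show.
export OMPI_MCA_pml="ucx"
export UCX_TLS=mm,knem,cma,dc_mlx5,dc_x,self

show_mpi_env
```

Calling show_mpi_env inside a batch script, just before the MPI launch command, records the effective settings in the job log.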

The ompi_info tool can be used to get detailed information about the OpenMPI installation and its local configuration:

ompi_info --all
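
Since the output of ompi_info --all is very long, it is often more practical to query a single component. The sketch below wraps ompi_info --param with a check that the tool is available; the function name show_mca_params is hypothetical, and the choice of the pml framework with the ucx component is only an example matching the settings above.

```shell
#!/bin/bash
# Hypothetical wrapper: show the parameters of one MCA component.
# --level 9 asks ompi_info to include all parameter levels.
show_mca_params() {
    local framework="$1" component="$2"
    if command -v ompi_info >/dev/null 2>&1; then
        ompi_info --param "$framework" "$component" --level 9
    else
        echo "ompi_info not found; make an OpenMPI installation available in PATH first" >&2
        return 1
    fi
}

# Example: parameters of the UCX point-to-point messaging component.
show_mca_params pml ucx || true
```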

All MPIs

An unlimited stack size can have a negative influence on performance. It is better to set only the amount actually needed, e.g.

ulimit -s 102400       # using bash; value in KiB, i.e. 100 MiB

It is also recommended to disable core file generation if it is not needed for debugging purposes.

ulimit -c 0    # using bash
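
To see what is currently in effect before changing anything, the soft limits can be printed first. The following sketch assumes bash; it applies the core file limit inside a subshell only, so the calling shell is left unchanged. In a real job script the ulimit commands would be placed directly before the MPI launch command.

```shell
#!/bin/bash
# Print the current soft limits for stack size (KiB) and core file size.
echo "stack soft limit: $(ulimit -S -s)"
echo "core soft limit:  $(ulimit -S -c)"

# Apply the new limit in a subshell so the calling shell keeps its settings.
(
    ulimit -S -c 0
    echo "core soft limit inside subshell: $(ulimit -S -c)"
)
```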