Known issues
This is a collection of current software and hardware problems and limitations on Levante. Where possible, the status and a workaround are given.
Current Issues
Slurm - Connection issues between slurmd and slurmctld leading to job loss
User jobs are sometimes cancelled or aborted by Slurm due to internal communication timeout errors. This might be fixed with a Slurm update planned for mid-February.
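To see whether a particular job was affected, its final state can be looked up in the Slurm accounting records. The following is a minimal sketch, not an official tool: it assumes `sacct` is available on the login node, and the helper names (`parse_sacct`, `lost_jobs`, `query_job`) as well as the set of states counted as "lost" are hypothetical choices made for illustration.

```python
import subprocess

# Assumption of this sketch: these final states indicate a lost job.
LOST_STATES = {"CANCELLED", "FAILED", "NODE_FAIL"}


def parse_sacct(output):
    """Map job ID -> state from `sacct --parsable2 --noheader` output."""
    states = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        job_id, state = line.split("|")[:2]
        # sacct may report "CANCELLED by <uid>"; keep only the first word.
        states[job_id] = state.split()[0] if state.strip() else ""
    return states


def lost_jobs(states):
    """Return the sorted job IDs whose state is in LOST_STATES."""
    return sorted(job for job, state in states.items() if state in LOST_STATES)


def query_job(job_id):
    """Ask sacct for the allocation-level state of one job."""
    result = subprocess.run(
        ["sacct", "-X", "-j", str(job_id),
         "--format=JobID,State", "--noheader", "--parsable2"],
        capture_output=True, text=True, check=True,
    )
    return parse_sacct(result.stdout)
```

Calling `lost_jobs(query_job(job_id))` with a job ID of your own returns a non-empty list if that job ended in one of the states above. A job cancelled deliberately by its owner is reported the same way, so the result only narrows down the candidates.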
Heterogeneous jobs do not work properly
Currently, heterogeneous jobs (so-called hetjobs, see the Slurm documentation) do not start reliably. We plan to update Slurm to resolve this.
Resolved Issues
The following blog entries describe resolved issues on Levante.
Bus error in jobs on February 11, 2022
Update 2022-06-14: The problem was solved by an update of the Lustre client provided by our storage vendor. The workaround described below should no longer be necessary. If one of your jobs runs into a bus error, please let us know.
Jobs running on Levante sometimes fail with a bus error, similar to the example below:
libGL.so.1 missing on April 22, 2022
Update 2022-10-17: This problem should be fixed with our current software stack. The workaround is not required any longer.
When trying to start a GUI application such as gvim, you get an error message: