Posted in 2024

How to get more memory for my Slurm job

The amount of memory specified on the Levante configuration page for different node types refers to the total physical memory installed in a node. Since some memory is reserved for the needs of the operating system and the memory-based local file system (e.g. /tmp, /usr), the amount of memory actually available for job execution is less than the total physical memory of a node.

The table below provides numbers for the preset amounts of physical memory (RealMemory), memory reserved for the system (MemSpecLimit) and memory available for job execution (which is the difference between RealMemory and MemSpecLimit) for three Levante node variants:

Read more ...
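
As a rough illustration of the subtraction described above, the following sketch computes RealMemory minus MemSpecLimit from the text printed by `scontrol show node`. The helper name `usable_mem_mb` is hypothetical and the node name is a placeholder. It assumes that both fields appear in the output as `Key=Value` pairs with values in megabytes.

```shell
# Hypothetical helper: read "scontrol show node" output on stdin and print
# RealMemory - MemSpecLimit (both are reported by Slurm in MB).
usable_mem_mb() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^RealMemory=/)   { split($i, a, "="); real = a[2] }
                if ($i ~ /^MemSpecLimit=/) { split($i, b, "="); spec = b[2] }
            }
        }
        # A missing MemSpecLimit is treated as 0 by awk arithmetic.
        END { print real - spec }
    '
}

# Usage on a login node (replace <nodename> with a real node):
#   scontrol show node <nodename> | usable_mem_mb
```

The result is the upper bound a job can request on that node, for example via the `--mem` option of `sbatch`.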


Slurm-managed cronjobs

To execute recurring batch jobs at specified dates, times, or intervals, you can use the Slurm scrontab tool. It provides a reliable alternative to the traditionally used cron utility to automate periodic tasks on Levante.

To define the recurring jobs, Slurm uses a configuration file, a so-called crontab, which is managed with the scrontab command. Running scrontab with the -e option opens an editing session in which you can create or modify a crontab:

Read more ...
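
For orientation, a minimal sketch of what such a crontab might contain is given below. It is not taken from the full post: the partition, the account `ab0123`, the time limit, and the script path are all placeholders that need to be replaced with your own values. Lines starting with `#SCRON` pass options to Slurm in the same way as `sbatch` options, and the last line uses the usual five-field cron schedule.

```shell
# Write an example crontab to a scratch file so it can be reviewed first.
# All values below (partition, account, script path) are placeholders.
sample="${TMPDIR:-/tmp}/scrontab_example.txt"
cat > "$sample" <<'EOF'
#SCRON --partition=shared
#SCRON --account=ab0123
#SCRON --time=00:10:00
0 3 * * * $HOME/bin/nightly_task.sh
EOF

# To install it, either paste the content into "scrontab -e"
# or load the file directly with: scrontab "$sample"
# "scrontab -l" lists the currently installed entries.
```

With this schedule, the job would be submitted every day at 03:00.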


HSM module cleanup April 2024

On 30 April 2024, many modules of old slk and slk_helpers versions will be removed from Levante.

Modules which will be removed:

Read more ...


Changelog slk_helpers v1.12.10

Update from slk_helpers v1.10.2 to v1.12.10

See here for changes from slk_helpers v1.9.7 to v1.10.2

Read more ...


HSM: check storage location of files

From time to time, connection issues occur between StrongLink and one of the tape libraries. If the retrieval of a file fails repeatedly in such a situation, it is useful to check whether the file is stored on a tape in the affected library. For this purpose, we provide the commands slk_helpers resource_tape and slk_helpers tape_library.

Example:

Read more ...


Keeping disk usage in /home under control

Sufficient free storage space in your /home directory is required to run many software packages successfully. This is because the home directory is used to store various small files during program execution, such as configuration files, log files, lock files, named pipes, and Unix sockets. If these files cannot be created or written, an exhausted home quota may lead to a wide range of seemingly unrelated error messages and issues. One example is the failure to start a JupyterHub session. Regularly monitoring disk space usage in your home directory is therefore essential for the smooth use of the Levante HPC system.

You can check disk usage and limits for your /home directory on Levante using the following command:

Read more ...
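
Once you know that the quota is nearly exhausted, it helps to see which directories take up the space. The sketch below is a generic complement built from standard `find`, `du`, and `sort`, not the quota command from the full post. The function name `largest_entries` is hypothetical.

```shell
# Hypothetical helper: list the N largest entries directly under a directory.
# Defaults: directory = $HOME, N = 10. Sizes are printed in kilobytes.
largest_entries() {
    dir="${1:-$HOME}"
    n="${2:-10}"
    # find also picks up hidden entries such as ~/.cache or ~/.conda;
    # du -sk prints one total per entry, sort -rn puts the largest first.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk -- {} + 2>/dev/null \
        | sort -rn \
        | head -n "$n"
}

# Usage:
#   largest_entries "$HOME" 10
```

Hidden directories are included on purpose, since caches and package environments stored there are a common reason for a full home directory.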