FAQ#

v1.43, 08 Dec 2023

General information about the HSM system#

What does HSM mean?#

Hierarchical Storage Management. It means the DKRZ tape archive.

What type of HSM system is used at DKRZ?#

The software is called StrongLink.

Would it be possible/desirable to use only one command for slk and slk_helpers main classes?#

Basically: yes. We decided otherwise. slk is developed by the StrongLink developers, whereas slk_helpers is developed by DKRZ staff. A command which we implement in slk_helpers might later be replaced by an official slk command. It might cause problems if one command were replaced by another one with the same name. Additionally, the development cycle of slk is much slower than that of slk_helpers. Thus, users need to live with slk as it is, while we can quickly publish bugfixes and other changes for slk_helpers.

Training, Questions and Adaption of Workflows#

Has there been an introduction session to the new HSM system and will there be such sessions in future?#

Yes, two DKRZ Tech Talks took place on 6 July 2021 and 12 July 2023. The first one presented the new HSM system and the new command line tool slk; the second one presented hints for retrievals. Recordings of the Tech Talks are available on YouTube.

Where can I find written documentation about the new HSM system?#

The user documentation is available at https://docs.dkrz.de.

Who do I contact when I have questions or issues regarding the new HSM system and its usage?#

Please contact us via support@dkrz.de.

Archiving and Retrieval#

How do I interact with the new system?#

The official command line tool for tape access is called slk. However, slk is missing a few small but very useful features. Therefore, a tool called slk_helpers was written at DKRZ to add these features. Details on these two tools are provided in the HSM Documentation at https://docs.dkrz.de. Additionally, a Python wrapper pyslk exists and packems was adapted to slk.

Where can I use slk and slk_helpers?#

slk and slk_helpers are installed as module slk on all Levante nodes. There are no limitations for the usage of slk commands except for retrieve and archive.

On Levante login nodes: slk retrieve can only retrieve one file at once. Please use slk archive only for a few small files.

If you wish to use slk archive or slk retrieve interactively, please use the interactive partition and allocate 6 GB of memory for this purpose (via salloc --mem=6GB, see: Run slk in the “interactive” partition and Data Processing on Levante). For non-interactive archival and retrieval tasks, we recommend using the compute and shared partitions. We strongly recommend allocating 6 GB of memory in your batch scripts (--mem=6GB, see: Run slk as batch job, details: slk memory footprint). If your slk is killed with a message like /sw/[...]/bin/slk: line 16: [...] Killed, then please inform the DKRZ support (support@dkrz.de) and allocate 8 GB or 10 GB of memory.

How do I login to the HSM system?#

Login is done via the command line tool slk login using your DKRZ credentials (LDAP; as used for luv). The command line tool stores a login token in ~/.slk/config.json for a specific time period (currently 30 days) so that you do not need to log in again during that period.

Please have a look into How do I automatically/non-interactively check whether I own a valid slk login token? when you would like to be reminded when the token is due to expire.

Does Kerberos authentication work on the new HSM system?#

No, Kerberos does not work anymore. You only need to provide your login data to the command line tool in certain time intervals (currently 30 days).

Do I have to provide my login credentials each time I use the command line tool?#

No, a login token is generated at first login. This token is valid for a fixed period of time (currently 30 days) and then has to be renewed by performing a login operation. You do not need to wait for 30 days: the login token can be renewed at any time you wish. It is stored in ~/.slk/config.json.

Please have a look into How do I automatically/non-interactively check whether I own a valid slk login token? when you would like to be reminded when the token is due to expire.

Can I use the command line tool non-interactively?#

Yes, it can be used non-interactively when a login token exists. From time to time, an interactive session of the command line tool is necessary in order to renew the login token. The command line tool returns proper exit codes so that the success or failure of a program call can be evaluated automatically. Some slk commands write their output into the terminal buffer and not to stdout or stderr. Thus, no output can be captured in non-interactive jobs, only exit codes. Amongst others, slk retrieve and slk archive are affected. In contrast, slk archive -vv writes to stdout.

Can I access archived data from outside the DKRZ?#

Currently, data in the tape archive can only be accessed via Levante.

Do I have write access to the archive from outside the DKRZ?#

No. slk is not made for data transfer via the internet.

Will the new system be available as a Globus endpoint for external transfers?#

No, not at the moment and not in the near future.

How do I create directories in the HSM?#

slk archive automatically creates needed target folders. If you want to create a directory manually, please use slk_helpers mkdir /ex/am/ple/dir or slk_helpers mkdir -R /ex/am/ple/dir. If you want to create several nested folders (like mkdir -p does) please use slk_helpers mkdir -R <new namespace> – without -R only the right-most directory will be created. If you do not want to use the slk_helpers, please do as follows: create empty directories locally, fill them with non-empty dummy files and archive them via slk archive -R.

Do I manually need to check the integrity of archived and retrieved files?#

Please see “Does StrongLink automatically check the integrity of archived and retrieved files?”

Is there an option to continue archiving if it was interrupted?#

Yes. If the archival of several files was interrupted, please run the same call of slk archive a second time. slk archive will only transfer those files which (a) have not been archived yet, (b) have only been partly archived or (c) have been modified since the first archival. The partly archived files are those files which were being transferred when slk archive was killed. Please be aware that these incomplete files are listed by slk list and may even have a checksum (of the incomplete file). Therefore, please check with slk_helpers has_no_flag_partial whether StrongLink flagged these files as partial files and notify us via support@dkrz.de if the flag remains even after slk archive finished successfully. Alternatively, you could compare the checksum of the original file with the checksum calculated by StrongLink (slk_helpers checksum ...).

Details: Validate archivals
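Re-running the same slk archive call after an interruption can be automated. The following bash function is only a sketch: the name archive_with_retries, the default of five attempts and the variable RETRY_WAIT (seconds to wait between attempts, default 60) are assumptions made for illustration. It relies on slk archive returning a non-zero exit code when the archival did not finish successfully.

```shell
# archive_with_retries SRC DST [MAX_ATTEMPTS]
# Repeat the same "slk archive -R" call until it exits with 0 or until
# MAX_ATTEMPTS (default: 5) calls have failed.
# RETRY_WAIT is a hypothetical variable: seconds between two attempts.
archive_with_retries() {
    local src="$1" dst="$2" max="${3:-5}" attempt=1
    while [ "$attempt" -le "$max" ]; do
        if slk archive -R "$src" "$dst"; then
            echo "archival finished after $attempt attempt(s)"
            return 0
        fi
        echo "slk archive attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$max" ]; then
            sleep "${RETRY_WAIT:-60}"
        fi
    done
    return 1
}
```

A successful return only means that the last slk archive call exited with 0. Checking for partial files as described above remains advisable.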

Does any command exist for deleting files immediately from /work in case of successful archival?#

No, such a tool does not exist. We currently do not plan to provide such a tool.

Is it possible to archive into my existing folder structure created on HPSS?#

Yes, the folder structure and write permissions remained untouched, except that the root folder /hpss was dropped.

Is there a “double” storage feature as for HPSS?#

Yes, there is a “double” storage feature. Please see the chapter “Storage options and quota” in the new HSM documentation for details.

What does “namespace”, “global namespace” or “gns” mean?#

StrongLink uses the term “namespace” or “global namespace” (= “gns”). A “(global) namespace” is comparable to a “directory” or “path” on a common file system.

How do I automatically/non-interactively check whether I own a valid slk login token?#

You can check the validity of your login token via slk_helpers session. slk does not provide a command that returns the status of the login token as true/false or valid/invalid. If you do not want to use slk_helpers but want to check the status of the login token anyway, please use this command:
$ test `date -d "$(jq .expireDate ~/.slk/config.json | sed 's/"//g')" +'%s'` -gt `date +%s`

$? will be 0 if login token is valid and 1 if not.

You need to have the program jq available which is the case on Levante.

If you wish to be reminded when your token is due to expire, you can submit a SLURM script which does this for you. A script for this purpose and a description are given in Reminder login token expires.
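Building on the jq command shown above, the remaining validity can also be expressed in days. This is a sketch, not an official tool: the function name slk_token_days_left and the return code 2 for unreadable input are assumptions. Like the command above, it assumes that the value of expireDate can be parsed by GNU date.

```shell
# slk_token_days_left [CONFIG]
# Print the number of whole days (truncated toward zero) until the login
# token expires. CONFIG defaults to ~/.slk/config.json.
# Returns 2 if the expiration date cannot be read or parsed.
slk_token_days_left() {
    local config="${1:-$HOME/.slk/config.json}"
    local expire expire_s now_s
    expire=$(jq -r .expireDate "$config") || return 2
    expire_s=$(date -d "$expire" +%s) || return 2
    now_s=$(date +%s)
    echo $(( (expire_s - now_s) / 86400 ))
}
```

In a script this could be used as follows: if [ "$(slk_token_days_left)" -lt 3 ]; then echo "please run slk login soon"; fi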

Is my slk login token still valid?#

How do I check for how long my login token is still valid?#

slk_helpers session will print the expiration date. Alternatively, the date and time until which the login token is valid are stored in the slk config file (~/.slk/config.json). The key is expireDate. You might open the config file with a text editor or print its content with tools like cat, less or jq.
jq .expireDate ~/.slk/config.json

You need to have the program jq available. jq is installed on Levante and available without loading any package.

If you wish to be reminded when your token is due to expire, you can submit a SLURM script which does this for you. A script for this purpose and a description are given in Reminder login token expires.

Can I provide a file list to “slk archive” such as “-T” for “tar”?#

Currently, this is not possible. We do not expect this feature to be added to slk archive.

Can a user run multiple archival and retrieval requests at a time?#

Yes, that is possible. However, we strongly recommend running only one slk call per 6 GB of allocated memory (see slk memory footprint). The transfer rate of slk archive and slk retrieve might be hardware-limited on shared nodes when other users copy data as well (see slk data transfer rate). Hence, splitting archival/retrieval requests up into multiple ones does not necessarily increase the transfer speed. We recommend aggregating file retrievals.

Where on Levante should I run slk?#

Please see Where can I use “slk” and “slk_helpers”?

How does slk archive the files: does it tar them itself (similar to packems) or should we tar the files before hand?#

slk does not pack or tar files. Metadata from netCDF files is automatically imported into the StrongLink database to simplify search and retrieval later on. Direct archival of netCDF files is preferable with respect to the metadata import feature. However, many small files are bad for tape performance and might cost additional storage space (see Storage options and quota). Therefore, using packems is reasonable when a large number of very small files has to be archived.

Are there requirements on the file size for the tape archival?#

  • Preferred file size: 10 GB to 200 GB. Transfer rates are slower for files larger than 200 to 250 GB.

  • Each file smaller than 1 GB will be charged 1 GB.

  • Lower size limit: small files are not optimal for tape storage. Therefore, we encourage users to pack small files if there is no need to use the netCDF metadata features of StrongLink (see File Search and Metadata).

  • Upper size limit: file sizes of a few TB are possible and have been successfully tested, but we recommend the same sizes as for HPSS: max. 500 GB.
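To illustrate the charging rule from the list above, the following sketch sums the sizes of all files below a directory and counts each file smaller than 1 GB as 1 GB. The function name charged_bytes is hypothetical, 1 GB is assumed to mean 1024^3 bytes here, and GNU find is required. The result is only an estimate based on the rule as stated above, not an official quota calculation.

```shell
# charged_bytes DIR
# Print the number of bytes that would be charged for all regular files
# below DIR if each file smaller than 1 GB is counted as 1 GB.
# Assumption: 1 GB = 1024^3 bytes.
charged_bytes() {
    find "$1" -type f -printf '%s\n' |
        awk -v min=1073741824 '
            { total += ($1 < min ? min : $1) }
            END { printf "%.0f\n", total + 0 }'
}
```

Comparing this value with the output of du -sb on the same directory gives an impression of how much could be saved by packing small files before archival.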

I am a member of a project but cannot access this project’s data?#

If you were added to the project recently, please login again via slk login. For details please see group memberships of user updated on login.

If you have been a member of the particular project for a long time or have followed the previous instructions, please let another user of the project check whether the group permissions are properly set.

If neither approach works, please contact support@dkrz.de.

Why do I get “Exception …: lateinit property websocket has not been initialized”?#

When running slk archive with the argument --streams N, please only use values between 1 and 4 for N. For details, please see slk archive: Exception …: lateinit property websocket has not been initialized.

My slk archive seems to hang. What should I do?#

Please check whether /home is hanging. If /home is hanging, slk cannot access its login token and cannot write into its log. Therefore, slk hangs when /home is hanging. You might also run slk archive with -v or -vv to see the progress.

How much data can I archive at once?#

We suggest archiving no more than approximately 5 TB with one call of slk archive. If you archive more than that and the StrongLink system is busy, the transfer might be interrupted unexpectedly. The slk log (~/.slk/slk-cli.log) will show this error (look for unexpected end of stream on https://archive.dkrz.de/...):

2022-11-24 11:16:22 INFO  Executing command: "archive -R /work/ab1234/c567890/much_data /arch/zy0987/c567890/target
2022-11-24 11:18:25 ERROR Unexpected exception
java.io.IOException: unexpected end of stream on https://archive.dkrz.de/...
        at
okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:202)
~[slk-cli-tools-3.3.21.jar:?]
[...]
[...]
[...]
        ... 16 more
2022-11-24 11:18:25 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 0/85083 files [0B/20.3T]

If you want to archive more than 10 TB at once (e.g. 100 TB), please be prepared to run slk archive repeatedly. In the end, a summary similar to this one should be printed to the log:

2022-11-25 11:21:10 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 0/85083 files [0B/20.3T]
Total files skipped: 85083/85083 files [20.3T/20.3T]
        Unchanged files: 85083

or this one:

2022-11-25 09:46:55 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 4342/85083 files [1.2B/20.3T]
Total files skipped: 80741/85083 files [20.3T/20.3T]
    Unchanged files: 80741

What does the error “unexpected end of stream on https://archive.dkrz.de/…” mean?#

This error is not printed to the command line but into the slk log. The full error looks like this:

2022-11-24 11:16:22 INFO  Executing command: "archive -R /work/ab1234/c567890/much_data /arch/zy0987/c567890/target
2022-11-24 11:18:25 ERROR Unexpected exception
java.io.IOException: unexpected end of stream on https://archive.dkrz.de/...
        at
okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:202)
~[slk-cli-tools-3.3.21.jar:?]
[...]
[...]
[...]
        ... 16 more
2022-11-24 11:18:25 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 0/85083 files [0B/20.3T]

This error might be thrown if a large amount of data is archived (approximately more than 5 TB to 10 TB) while the StrongLink system is under high load due to many user archivals and retrievals.

The output 0/85083 is not necessarily correct and some files might have been archived. The slk client does not get confirmation of successful archival before it stops due to an interrupted data stream. The philosophy of slk in this context is: better an error is displayed, although the file was archived correctly, than no error is displayed, although the file was not archived correctly.
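Because the error only appears in the slk log, a script can check the log after an archival. The function below is a minimal sketch with a hypothetical name (slk_log_has_stream_error). It searches the whole log file, so it also reports errors from earlier slk calls.

```shell
# slk_log_has_stream_error [LOGFILE]
# Exit code 0 if LOGFILE (default: ~/.slk/slk-cli.log) contains the
# "unexpected end of stream" error, 1 if it does not and 2 if the
# file cannot be read.
slk_log_has_stream_error() {
    local log="${1:-$HOME/.slk/slk-cli.log}"
    grep -q "unexpected end of stream on" "$log"
}
```

Example usage: if slk_log_has_stream_error; then echo "archival was interrupted, please run slk archive again"; fi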

Additional features#

Which new features does the HSM System provide?#

Extended metadata is harvested from netCDF files which (a) have been archived into StrongLink and (b) were archived into HPSS but have been retrieved recently. It is possible to search files based on these extended metadata using slk search. Details on the search feature: File Search and Metadata.

From which file types are extended metadata harvested?#

Harvesting from netCDF files is implemented in StrongLink. Additionally, metadata is harvested from common picture, video, audio and document file formats (details in Reference: metadata schemata). Further formats are being investigated and could be introduced later after functional issues of slk are fixed. We don’t expect that this will happen soon.

Which metadata fields are harvested from netCDF files?#

Most global attributes and variable names of netCDF files are stored in a metadata database. It is possible to search for each of these global attributes. Hence, properly self-described and standardized files are easier to find later on. These metadata are read-only. Metadata from a standardized subset of global attributes are copied into an indexed metadata database. These can be modified and searched more efficiently. Please see the DKRZ documentation page on metadata schemata for details.

Is there a python interface available?#

Yes, we offer a Python wrapper package called pyslk. It is installed in python3/2022.01-gcc-11.2.0, python3/2023.01-gcc-11.2.0 and python3/unstable on Levante. The latter module might contain a newer pyslk version which is in the final testing phase. The slk module has to be loaded when pyslk is used. For details, please see https://hsm-tools.gitlab-pages.dkrz.de/pyslk .

Is it possible to use slk chmod and slk group (=chgrp) commands recursively by the user?#

Yes, it is possible. Please provide -R to apply these commands recursively.

Are the search IDs user specific?#

No, the search IDs are assigned globally. E.g. the search ID 423 exists only once. Each search ID can be used by every user. Thus, you can share your search IDs with your colleagues. However, the output of slk list SEARCH_ID or retrieval of slk retrieve SEARCH_ID ... depends on the read permissions of the executing user.

How long are the search IDs stored?#

The search results behind search IDs are stored for at least a few weeks. They are cleaned up depending on different parameters.

Is a search ID automatically updated when new files are archived which match the original search query?#

No, the IDs of files matching the search query are stored once when the search is performed. This list of file IDs will not be updated afterwards, except if files on the list are deleted. However, file-specific metadata, such as file size or permissions, are retrieved at the time when the search ID is used. slk list SEARCH_ID will show today’s sizes of files covered by the search ID SEARCH_ID. Files that matched a search query when it was run are still listed by slk list even if they no longer match the original search query. This might happen if a file is renamed.

Can I share my search’s search ID with other DKRZ users?#

Yes, you can. Please see “Are the search IDs user specific?” for details.

What does “RQL” mean?#

RQL abbreviates “resource query language” and is another name for the “StrongLink Query Language”. Please see StrongLink query language Reference and File Search and Metadata for details.

Is there any possibility to move around in the file system with something like the cd command?#

No, this is not possible. The slk does not start its own shell like pftp or pure ftp do. It rather works like scp or rsync.

When slk list shows a file with “-” (not “t”) which means it exists at the cache: Does that mean it is not yet on the tape?#

Right now, it means that the file is in the cache. It might be on tape as well, but not necessarily. If the t is shown, the file is stored only on tape. We intend to display both states at some point.

For a better overview of the archived files, is there a possibility to list only folders, not all files?#

When you use slk list with a specific namespace path, it shows all the files and namespaces in that specific namespace. It does not provide an argument like -d as ls does. You might use slk list GNS_PATH | grep -E "^d" to print only folders.

Alternatively, slk_helpers list_search -d SEARCH_ID can be used to print all namespaces that have been found by the search SEARCH_ID. Details on searches: File Search and Metadata

Is it possible to remove files from the archive?#

Yes, you can use slk delete for removing files and slk delete -R for removing namespaces and their content. Using wildcards or regular expressions with slk delete is not possible.

How to print the version of slk?#

Please run slk version to print the version of slk. A --version flag or similar does not exist.

How to search non-recursively in a namespace?#

By default, slk search searches recursively in a namespace provided via path. To do a non-recursive search, please add "$max_depth": 1 to the JSON expression as follows: {"path": {"$gte": "/ex/am/ple/path", "$max_depth": 1}}. Alternatively, please get the object id of the particular namespace via slk_helpers exists and then use it in your search query as value for the search field resources.parent_id (see slk Usage Examples).

Is it possible to move files within the archive?#

Yes, you can use slk move to move a file or namespace from one namespace to another. Absolute paths have to be used: slk move /old_path/file.nc /new_path. Renaming cannot be done with slk move, i.e. this does not work: slk move /old_path/file.nc /new_path/new_file_name.nc. Please use slk rename for renaming operations.

Is it possible to rename files within the archive?#

Yes, you can use slk rename to rename a file or namespace. slk rename cannot be applied to multiple files/namespaces.

How do I tag a folder with metadata?#

Tagging folders with metadata is not possible at the moment.

How do I tag an individual file with metadata?#

slk tag accepts individual files as input since slk version 3.3.56. If you want to tag several files in different folders, please run a search for these files and use the resulting search id as input for slk tag. Alternatively, you can use slk_helpers json2hsm which accepts metadata in JSON as input. Details: Set metadata.

How do I add metadata to a file?#

Please see How do I tag an individual file with metadata?.

How do I assign metadata to a file?#

Please see How do I tag an individual file with metadata?.

Can I overwrite metadata manually?#

Yes. Please see How do I tag an individual file with metadata?.

slk search does not find any resources although resources exist that seem to match the query#

Example command:#

$ slk search '{"resources.posix_uid": "25301"}'
Search continuing. .....
Search ID: 216

Reason:#

The query parser does not recognize when a wrong variable type is used. resources.posix_uid is of type integer and not string. Providing the wrong data types leads to 0 found results.

Solution:#

Write 25301 (integer) instead of "25301" (string) in the search query.

$ slk search '{"resources.posix_uid": 25301}'
Search continuing. ..... Search continuing. .....

Search ID: 217

Error: slk search yields RQL parse error#

Example command and error:#

$ slk search "{\"resources.size\":{\"$gt\": 1048576}}"
ERROR: Search failed. Reason: RQL parse error: No period found in collection field name ().

Reason:#

The $ in front of the gt was not escaped. Therefore, $gt is interpreted as environment variable by the shell before the query is handed to the slk. In most situations, no environment variable gt is defined leading to an empty string. If the query were surrounded by ' as delimiter and not by " then the $gt would not have been interpreted.

The above call of slk search as interpreted by the shell looks like

$ slk search "{\"resources.size\":{\"\": 1048576}}"

Solution:#

Either: use ' as delimiter of your search query instead of " to prevent operators starting with $ from being evaluated by your shell

Or: escape $’s in front of query operators by \ when you use " as delimiters of the query string.

'{"resources.size":{"$gt": 1048576}}'
"{\"resources.size\":{\"\$gt\": 1048576}}"

Note

In some situations it might be very useful to use " as delimiter for your queries – e.g. if environment variables are part of your query.

$ export file_size=1048576
$ slk search "{\"resources.size\":{\"\$gt\": $file_size}}"
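As an alternative to escaping each $ by hand, the query string can be assembled with printf, whose format string is single-quoted and therefore never evaluated by the shell. The function name size_query is hypothetical; the query itself is the one from the example above.

```shell
# size_query MIN_BYTES
# Print a search query for files larger than MIN_BYTES. The format string
# is single-quoted, so "$gt" is not expanded by the shell; only %d is
# replaced by the first argument.
size_query() {
    printf '{"resources.size":{"$gt": %d}}\n' "$1"
}
```

The result can be passed to slk search as a single argument, e.g. slk search "$(size_query "$file_size")".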

Advanced Technical Aspects#

Can a user influence if data is written into the HSM cache or onto tape?#

No. Fresh data (meant for archival) are first copied into the disc cache and then slowly written onto tape. They are removed from the HSM cache some time after they have been successfully written to tape. When data are retrieved from tape, they are first copied into the HSM disc cache and from there to the user-defined target file system. They are removed from the HSM cache after a grace period. Small files of a few MB in size or smaller remain in the HSM cache nearly permanently. The exact size threshold varies depending on the fill state of the HSM cache. Additionally, there is no guarantee that even a file of 1 byte size is never removed from the cache.

How much time does a file stay on the cache?#

We cannot give any numbers. The residence time in cache depends on the size of the files and the usage of the cache. We run clean up jobs regularly and monitor how fast the cache is filled.

How fast can data be read from the HSM?#

The target transfer rate between single nodes on Levante and the HSM cache is 1 GB/s. It might be higher in some situations and be reduced when the traffic is high. The retrieval rate from tape considerably depends on how many other read and write operations of other users are performed in parallel. The maximum read rate from tape is 300 MB/s. If all tape drives are in use, the request is queued and slk retrieve will be idle until a tape drive is free. The queue length for tape read jobs can be printed by slk_helpers job_queue. Please see Waiting and processing time of retrievals for details.
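The rates given above allow a rough lower bound for the duration of a transfer. The following sketch (hypothetical function name min_transfer_seconds) simply divides a data volume in MB by a rate in MB/s. Real retrievals will usually take longer because waiting times in the tape queue and concurrent load are not considered.

```shell
# min_transfer_seconds SIZE_MB [RATE_MB_PER_S]
# Print SIZE_MB / RATE_MB_PER_S rounded to whole seconds. The default
# rate of 300 MB/s is the maximum read rate from tape stated above.
min_transfer_seconds() {
    awk -v mb="$1" -v rate="${2:-300}" 'BEGIN { printf "%.0f\n", mb / rate }'
}
```

For example, min_transfer_seconds 300000 prints 1000: reading 300 GB from a single tape drive takes at least about 17 minutes.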

How do I determine the id (uid) of a DKRZ user?#

Please use one of the following commands:
# get your user id
$ id -u

# get the id of any user
$ id -u USER_NAME

# get the id of any user
$ getent passwd USER_NAME
#  OR
$ getent passwd USER_NAME | awk -F: '{ print $3 }'

How do I determine the id (gid) of a DKRZ group?#

Please use one of the following commands:
# get group ID and group members
$ getent group GROUP_NAME
#  OR
$ getent group GROUP_NAME | awk -F: '{ print $3 }'

# get the names and ids of all groups of which you are a member
$ id

How do I determine the username of a DKRZ user when I have her/his id (uid)?#

Please use the following command:
# get the name of a user with uid USER_ID
$ getent passwd USER_ID
#  OR
$ getent passwd USER_ID | awk -F: '{ print $1 }'

How do I determine the group name of a DKRZ group when I have its id (gid)?#

Please use one of the following commands:
# get group name of a group with gid GROUP_ID
$ getent group GROUP_ID
#  OR
$ getent group GROUP_ID | awk -F: '{ print $1 }'

How do I determine the MIME type of a file?#

You could use file --mime-type FILE or file -b --mime-type FILE to determine the MIME type on the Linux shell. Please be aware that different tools determine the MIME type differently (i.e. by file header or by file extension) and MIME type databases might differ. It might be better not to search for a specific MIME type but for a particular file extension – e.g. via {"resources.name": {"$regex": ".*nc$"}}. StrongLink allocates the MIME type application/x-netcdf to netCDF files.

Can the search ID of slk search be captured by a shell variable?#

slk search does not provide this feature out of the box. Currently (this might change in future versions), the search ID is printed in columns >= 12 of the second row of the text output of slk search. We can use tail and sed to get the second (i.e. last) line and extract the number, or use tail and cut to get that line and drop the first 11 characters. Example:
# normal call of slk search
$ slk search '{"resources.posix_uid": 23501}'
Search continuing. .....
Search ID: 466

# get ID using sed:
$ search_id=`slk search '{"resources.posix_uid": 23501}' | tail -n 1 | sed 's/[^0-9]*//g'`
$ echo $search_id
470

# get ID by dropping first 11 characters of the second line
$ search_id=`slk search '{"resources.posix_uid": 23501}' | tail -n 1 | cut -c12-20`
$ echo $search_id
471

# use awk pattern matching to get the correct line and correct column
$ search_id=`slk search '{"resources.posix_uid": 25301}' | awk '/Search ID/ {print($3)}'`
$ echo $search_id
507

Note

This is an example for bash. When using csh, you need to prepend set to the assignments of the shell variables: set search_id=....

Is the metadata of files within zip/tar files evaluated/ingested?#

No, the metadata of packed files is not ingested. However, this feature has been requested to be implemented by StrongLink.

Does the packems package work with the new HSM system?#

Yes, packems has been adapted to the new HSM system in cooperation with the MPI-M. unpackems needs slk retrieve to work and, hence, does not run on the login nodes. For packems we recommend having slk retrieve available. Additionally, packing and archiving files causes a high CPU and memory load. Therefore, we strongly recommend not using packems on the login nodes. Instead, please use the interactive, shared or compute partitions. Please have a look into our packems quick help and into the packems manual: https://code.mpimet.mpg.de/projects/esmenv/wiki/Packems.

Is it possible to use listems to list files that were archived with packems on the HPSS?#

Yes, that’s possible.

Is it possible to use unpackems to retrieve files that were archived with packems on the HPSS?#

Yes, that’s possible. However, unpackems does not run on the login nodes. Please see Does the packems package work with the new HSM system? for details.

Can you work directly with files in the archive (e.g. with Python)?#

No, you have to download files to change them and archive them again.

Terminal cursor disappears after stopping a slk command. How to get it back?#

If a slk command with a progress bar is canceled by the user, the shell cursor might disappear. One can make it re-appear by (a) running reset or (b) starting vim and leaving it directly (:q!).

Is a file stored in the HSM cache or exclusively on tape?#

Solution a: In the output of slk list, please check the 11th character of the first column (permissions string). If this character is t then the file is exclusively stored on tape. If it is a - then the file is available from the HSM cache.

Solution b: Use slk_helpers iscached RESOURCE_PATH to check whether a file is available from the HSM cache (exit code is 0) or not (exit code is 1).
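Solution a can be scripted by inspecting the 11th character of the permissions string. The function below is a sketch with a hypothetical name (storage_state). It expects only the first column of one line of slk list output as its argument; the permission strings in the usage note are made-up examples.

```shell
# storage_state PERMISSIONS
# Interpret the 11th character of a permissions string taken from the
# first column of the "slk list" output.
storage_state() {
    case "$(printf '%s' "$1" | cut -c11)" in
        t) echo "tape only" ;;
        -) echo "cached" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

For instance, storage_state "-rw-r--r--t" prints "tape only", whereas storage_state "-rw-r--r---" prints "cached".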

What is an exit code?#

While exiting, each program returns an integer number which indicates whether the program finished successfully or not. Exit codes are not printed to the stdout or stderr streams but need to be explicitly captured by the user. An exit code of 0 indicates that everything went well. Exit codes >0 usually indicate that something went wrong. However, non-zero exit codes do not necessarily mean that an error occurred. For example, if grep does not match anything and no error occurs, it returns 1 as exit code. It returns 2 if an error occurs.
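The three grep exit codes mentioned above can be told apart in a script. The following sketch uses a hypothetical function name (grep_status) and stores the exit code in a variable before evaluating it.

```shell
# grep_status PATTERN FILE
# Translate the exit code of grep into a short message:
# 0 = at least one match, 1 = no match, 2 (or larger) = error.
grep_status() {
    local rc=0
    grep -q -- "$1" "$2" 2>/dev/null || rc=$?
    case "$rc" in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;
    esac
}
```

Calling grep_status with a file that does not exist prints "error", whereas a readable file without the pattern yields "no match".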

How do I capture exit codes?#

The shell variable $? contains the exit code of the preceding command. Examples:

# successful program call
$ slk version
SCLI Version 3.3.21
$ echo $?
0

# failed program call
$ slk retrieve abc def
ERROR: No resource exists with the following path: abc
$ echo $?
1

When commands are combined in a pipeline, the exit code of the last command in the pipeline is available. Examples:

# we capture the exit code of slk list
$ slk list quatsch
ERROR: The list command requires a search ID, or a full namespace path starting with a forward slash (/).
$ echo $?
1

# we capture the exit code of cat (which exits successfully)
$ slk list quatsch | cat
ERROR: The list command requires a search ID, or a full namespace path starting with a forward slash (/).
$ echo $?
0
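In bash, the exit code of an earlier command in a pipeline is not lost: the array PIPESTATUS holds the exit codes of all commands of the most recent pipeline. The following sketch (bash only, hypothetical function name first_stage_status) pipes a command through cat as in the example above but returns the exit code of the first command.

```shell
# first_stage_status COMMAND [ARGS...]
# Run "COMMAND ARGS... | cat" and return the exit code of COMMAND
# instead of the exit code of cat. Requires bash (PIPESTATUS).
first_stage_status() {
    "$@" | cat
    return "${PIPESTATUS[0]}"
}
```

With this helper, first_stage_status slk list quatsch would hand back the exit code of slk list. Alternatively, set -o pipefail makes bash report a failure of any command in a pipeline.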

In some situations it might be valuable to store the exit code in a variable. This example is from a script:

...
slk retrieve real_data.nc /arch/bm0146/k204221/test_data
exit_code=$?
if [ $exit_code -ne 0 ]; then
    >&2 echo "an error of $exit_code occurred at `date` in slk retrieve call. Proceeding with next retrieval"
else
    echo "retrieval successful"
fi
...

If time is used with another command, the other command’s exit code is always returned although time finishes last.


Which exit codes does slk return?#

Please see here

Which exit codes do the slk_helpers return?#

Please see the HSM Documentation at https://docs.dkrz.de.

Why is slk automatically “killed”?#

slk is run and gets killed with a message similar to this:

/sw/spack-levante/slk-3.3.21-5xnsgp/bin/slk: line 16: 126083 Killed                  LC_ALL=en_US.utf8 LANG=en_US.utf8 ${SLK_JAVA} -Xmx4g -jar $JAR_PATH "$@"

slk was killed by the operating system because it used more system resources than it was allowed to use, commonly due to high memory usage. Please allocate sufficient memory to your job: run salloc or sbatch with --mem=6GB. If slk is still killed with this message, please allocate 8 GB or 10 GB of memory and inform the DKRZ support (support@dkrz.de). If you ran slk on a login node of Levante, please switch to the interactive partition for interactive usage of slk (Run slk in the “interactive” partition) or to the compute or shared partitions for batch processing (Run slk as batch job).
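
A batch script with this memory request might look like the following sketch. The account xz0123 is a hypothetical placeholder, and passing the script arguments on to slk is an assumption made for illustration. The check with command -v only avoids a confusing error if slk is not available in the job environment.

```shell
#!/bin/bash
#SBATCH --partition=shared    # or "compute", see the text above
#SBATCH --mem=6GB             # memory request suggested above
#SBATCH --account=xz0123      # hypothetical project account, replace with your own

# The arguments of the batch script are passed on to slk unchanged.
if command -v slk > /dev/null 2>&1; then
    slk_exit_code=0
    slk "$@" || slk_exit_code=$?
else
    >&2 echo "slk not found in PATH"
    slk_exit_code=127
fi
echo "slk exit code: $slk_exit_code"
```

If slk is still killed, the value of --mem would be raised to 8GB or 10GB as described above.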

Common issues#

Please see the extra page Known Issues

Changelog#

v1.43, 08 December 2023#

v1.42, 07 December 2023#

v1.41, 01 December 2023#

v1.40, 16 January 2023#

v1.39, 08 December 2022#

v1.38, 18 October 2022#

v1.37, 14 June 2022#

  • grammar and spelling corrections in various questions

v1.36, 07 June 2022#

v1.35, 03 June 2022#

v1.34, 19 April 2022#

v1.33, 30 March 2022#

v1.32, 28 February 2022#

v1.31, 11 February 2022#

v1.30, 06 December 2021#

v1.29, 18 November 2021#

  • modified: Session key has expired

v1.28, 12 November 2021#

v1.27, 11 November 2021#

v1.26, 01 November 2021#

v1.25, 27 October 2021#

v1.24, 23 October 2021#

  • modified: When does the new HSM system go online?

  • modified: Why is no exact time schedule for training and migration published yet?

v1.23, 15 October 2021#

  • modified: When does the new HSM system go online?

  • modified: When does the HPSS go offline? (removed in FAQ version 1.32)

  • modified: Why is no exact time schedule for training and migration published yet?

v1.22, 08 October 2021#

  • modified: When does the new HSM system go online?

  • modified: When does the HPSS go offline? (removed in FAQ version 1.32)

  • modified: Why is no exact time schedule for training and migration published yet?

v1.21, 01 October 2021#

  • new: Archival fails and Java NullPointerException in the log

v1.20, 29 September 2021#

  • modified: Why is no exact time schedule for training and migration published yet?

v1.19, 20 September 2021#

  • changed title of FAQ

  • corrected FAQ’s Changelog

v1.18, 17 September 2021#

  • added cross-references

  • minor layout changes

v1.17, 17 September 2021#

  • modified: When does the new HSM system go online?

  • modified: When does the HPSS go offline? (removed in FAQ version 1.32)

  • modified: Why is no exact time schedule for training and migration published yet?

  • modified: Who do I contact when I have questions or issues regarding the new HSM system and its usage?

v1.16, 17 August 2021#

  • modified: When does the new HSM system go online?

  • modified: When does the HPSS go offline? (removed in FAQ version 1.32)

  • modified: Why is no exact time schedule for training and migration published yet?

v1.15, 30 July 2021#

  • modified: When does the new HSM system go online?

  • modified: When does the HPSS go offline? (removed in FAQ version 1.32)

  • modified: Why is no exact time schedule for training and migration published yet?

  • new: How do I use the HSM/StrongLink test system? (removed in FAQ version 1.27)

  • new: How to print the version of slk?

  • new: Session key has expired

  • new: Login Unsuccessful - Incorrect Credentials

v1.14, 12 July 2021#

v1.13, 29 June 2021#

  • modified: When does the new HSM system go online?

  • modified: When does the HPSS go offline? (removed in FAQ version 1.32)

  • modified: Why is no exact time schedule for training and migration published yet?

v1.12, 08 June 2021#

v1.11, 06 May 2021#

v1.10, 23 April 2021#

v1.09, 06 April 2021#

v1.08, 12 March 2021#

v1.07, 10 March 2021#

v1.06, 08 March 2021#

v1.05, 23 February 2021#

v1.04, 22 February 2021#

v1.03, 18 February 2021#

v1.02, 12 February 2021#

v1.01, 28 January 2021#

  • first public version