FAQ#
v1.43, 08 Dec 2023
General information about the HSM system#
What does HSM mean?#
What type of HSM system is used at DKRZ?#
Would it be possible/desirable to use only one command for slk and slk_helpers main classes?#
slk is developed by the StrongLink developers and slk_helpers are developed by the DKRZ staff. A command which we implement in the slk_helpers might be replaced by an official slk command later on. It might cause problems if one command were replaced by another one with the same name. Additionally, the development cycle of slk is much slower than that of slk_helpers. Thus, users need to live with slk as it is while we can quickly publish bugfixes and other changes for slk_helpers.
Training, Questions and Adaption of Workflows#
Has there been an introduction session to the new HSM system and will there be such sessions in future?#
Yes, two DKRZ Tech Talks took place on 6 July 2021 and 12 July 2023. The first presented the new HSM system and the new command line tool slk, the second gave hints for retrievals. Recordings of the Tech Talks are available on YouTube.
Where can I find written documentation about the new HSM system?#
Who do I contact when I have questions or issues regarding the new HSM system and its usage?#
Archiving and Retrieval#
How do I interact with the new system?#
slk. Additionally, slk is missing a few small but very useful features. Therefore, a tool called slk_helpers was written at DKRZ to add these features. Details on these two tools are provided in the HSM Documentation at https://docs.dkrz.de . Additionally, a Python wrapper pyslk exists and packems was adapted to slk.
Where can I use slk and slk_helpers?#
slk / slk_helpers are installed as module slk on all Levante nodes. There are no limitations for the usage of slk commands except for retrieve and archive.
On Levante login nodes: slk retrieve can only retrieve one file at once. Please use slk archive only for a few small files.
If you wish to use slk archive or slk retrieve interactively, please use the interactive partition and allocate 6 GB of memory for this purpose (via salloc --mem=6GB, see: Run slk in the “interactive” partition and Data Processing on Levante). For non-interactive archival and retrieval tasks, we recommend using the compute and shared partitions. We strongly recommend allocating 6 GB of memory in your batch scripts (--mem=6GB, see: Run slk as batch job, details: slk memory footprint). If your slk is killed with a message like /sw/[...]/bin/slk: line 16: [...] Killed, then please inform the DKRZ support (support@dkrz.de) and allocate 8 GB or 10 GB of memory.
How do I login to the HSM system?#
Login is done via the command line tool slk login
using your DKRZ credentials (LDAP; as used for luv). The command line tool stores a login token in ~/.slk/config.json for a specific time period (currently 30 days) so that you do not need to log in again during that period.
Please have a look into How do I automatically/non-interactively check whether I own a valid slk login token? when you would like to be reminded when the token is due to expire.
Does Kerberos authentication work on the new HSM system?#
Do I have to provide my login credentials each time I use the command line tool?#
Please have a look into How do I automatically/non-interactively check whether I own a valid slk login token? when you would like to be reminded when the token is due to expire.
Can I use the command line tool non-interactively?#
slk commands write their output into the terminal buffer and not to stdout or stderr. Thus, no output can be captured in non-interactive jobs, only exit codes. Amongst others, slk retrieve and slk archive are affected. slk archive -vv writes to stdout.
Can I access archived data from outside the DKRZ?#
Do I have write access to the archive from outside the DKRZ?#
slk is not made for data transfer via the internet.
Will the new system be available as a Globus endpoint for external transfers?#
What happens if I archive a soft link?#
How do I create directories in the HSM?#
slk archive automatically creates needed target folders. If you want to create a directory manually, please use slk_helpers mkdir /ex/am/ple/dir or slk_helpers mkdir -R /ex/am/ple/dir. If you want to create several nested folders (like mkdir -p does) please use slk_helpers mkdir -R <new namespace>; without -R only the right-most directory will be created. If you do not want to use the slk_helpers, please do as follows: create empty directories locally, fill them with non-empty dummy files and archive them via slk archive -R.
Does StrongLink automatically check the integrity of archived and retrieved files?#
No. There are different tools for different situations with which you can check your archived files.
Files may be partly archived when slk archive is killed or ends with an error while these files are being archived. “Ends with an error” means that an exit code other than 0 is returned. “Killed” means events like: timeout of SLURM job script, manual termination via CTRL + C, process killed, timeout/disconnect of ssh session, killed by the operating system due to exceeded memory limit, etc. Currently, such files are displayed by slk list.
In order to identify partly archived files, please run slk_helpers has_no_flag_partial
and try to archive these files again with slk archive -vv ...
. If files are listed by slk_helpers has_no_flag_partial
but are skipped by slk archive
, please notify us via support@dkrz.de to perform additional checks and to remove this flag.
If slk archive finishes without errors (exit code is 0), you can assume that the archived files have been properly copied to the HSM cache. StrongLink does not do a full integrity check of all archived files. However, StrongLink stores checksums for all successfully archived files. You can manually get the stored checksums via slk_helpers checksum RESOURCE and compare them against the checksums of the local copies of the respective files. Theoretically, StrongLink offers a service to read random files from tape and to verify them against their checksums. Currently, this service is deactivated in order to reduce the tape traffic.
Based on our extensive validation tests of StrongLink: in all situations where slk archive was not interrupted and where an “Archive report” was written into the slk log (~/.slk/slk-cli.log), all archived files listed by slk list were complete and correct. However, bit flips and similar events might occur from time to time. Therefore, if you archive data, which are very important or very expensive to reproduce, we recommend validating the archived files via their checksums. For details, please have a look into Validate archivals.
Please see the question Is there an option to continue archiving if it was interrupted? if you wish to know how to deal with incompletely/partly archived files.
Do I manually need to check the integrity of archived and retrieved files?#
Please see “Does StrongLink automatically check the integrity of archived and retrieved files?”
Is there an option to continue archiving if it was interrupted?#
Yes. If the archival of several files was interrupted, please run the same call of slk archive a second time. slk archive will only transfer those files which (a) have not already been archived, (b) have only been partly archived or (c) have been modified since the first archival. The partly archived files are those that were being transferred when slk archive was killed. Please be aware that these incomplete files are listed by slk list and may even have a checksum (of the incomplete file). Therefore, please check with slk_helpers has_no_flag_partial whether StrongLink flagged these files as partial files and notify us via support@dkrz.de when the flag remains even after slk archive finished successfully. Alternatively, you could compare the checksum of the original file with the checksum calculated by StrongLink (slk_helpers checksum ...).
Details: Validate archivals
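The checksum comparison mentioned above could be scripted roughly as follows. This is only a sketch: verify_sha512 is a hypothetical helper name, and it assumes that the checksum reported by slk_helpers checksum for the archived copy is a SHA-512 hex digest that you pass in as the second argument.

```shell
# Sketch: compare the SHA-512 checksum of a local file with an expected value.
# verify_sha512 is a hypothetical helper; the expected value is assumed to be
# the hex digest reported for the archived copy (assumption: SHA-512).
verify_sha512() {
    local file=$1 expected=$2 actual
    # refuse to continue if the local file cannot be read
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 2; }
    # sha512sum prints "<hex digest>  <file name>"; keep only the digest
    actual=$(sha512sum "$file" | awk '{ print $1 }')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

The function returns 0 on a match, 1 on a mismatch and 2 if the local file is not readable, so it can be used directly in an if statement of a batch script.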
Does any command exist for deleting files immediately from /work in case of successful archival?#
Is it possible to archive into my existing folder structure created on HPSS?#
/hpss was dropped.
Is there a “double” storage feature as for HPSS?#
What does “namespace”, “global namespace” or “gns” mean?#
How do I automatically/non-interactively check whether I own a valid slk login token?#
slk_helpers session. slk does not provide a command that returns the status of the login token as true/false or valid/invalid. If you do not want to use the slk_helpers but want to check the status of the login token anyway, please use this command:
$ test `date -d "$(jq .expireDate ~/.slk/config.json | sed 's/"//g')" +'%s'` -gt `date +%s`
$? will be 0 if the login token is valid and 1 if not.
You need to have the program jq available, which is the case on Levante.
If you wish to be reminded when your token is due to expire, you can submit a SLURM script which does this for you. A script for this purpose and a description are given in Reminder login token expires.
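The date comparison from the command above can also be wrapped into a small function that takes the expiry timestamp as an argument. The following is a minimal sketch, not an official tool: token_valid_until is a hypothetical helper name and GNU date (option -d) is assumed.

```shell
# Sketch: return 0 if the given expiry timestamp lies in the future, 1 if it
# has passed, and 2 if the timestamp cannot be parsed.
# token_valid_until is a hypothetical helper name; GNU date (-d) is assumed.
token_valid_until() {
    local expiry_epoch now_epoch
    # convert the timestamp to seconds since the epoch
    expiry_epoch=$(date -d "$1" +%s 2>/dev/null) || return 2
    now_epoch=$(date +%s)
    [ "$expiry_epoch" -gt "$now_epoch" ]
}
# possible call, reading the timestamp with jq as in the command above:
# token_valid_until "$(jq -r .expireDate ~/.slk/config.json)" && echo "token valid"
```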
Is my slk login token still valid?#
How to I check for how long my login token is still valid?#
slk_helpers session will print the expiration date. Alternatively, the date/time until which the login token is valid is stored in the slk config file (~/.slk/config.json). The key is expirationDate. You might open the config file with a text editor or print its content with tools like cat, less or jq.
jq .expireDate ~/.slk/config.json
You need to have the program jq available. jq is installed on Levante and available without loading any package.
If you wish to be reminded when your token is due to expire, you can submit a SLURM script which does this for you. A script for this purpose and a description are given in Reminder login token expires.
Can I provide a file list to “slk archive” such as “-T” for “tar”?#
slk archive
Can a user run multiple archival and retrieval requests at a time?#
slk archive and slk retrieve might be hardware-limited on shared nodes when other users copy data as well (see slk data transfer rate). Hence, splitting archival/retrieval requests up into multiple ones does not necessarily increase the transfer speed. We recommend aggregating file retrievals.
Where on Levante should I run slk?#
Please see Where can I use “slk” and “slk_helpers”?
How does slk archive the files: does it tar them itself (similar to packems) or should we tar the files before hand?#
slk does not pack/tar files. Metadata from netCDF files is automatically imported into the StrongLink database to simplify search and retrieval later on. Direct archival of nc-files is preferable with respect to the metadata import feature. However, many small files are bad for tape performance and might cost additional storage space (see Storage options and quota). Therefore, the usage of packems is reasonable in the case of a large number of very small files.
Are there requirements on the file size for the tape archival?#
Preferred file size: 10 GB to 200 GB. The transfer rate is slower for files larger than 200 to 250 GB.
Each file smaller than 1 GB will be charged 1 GB.
Lower size limit: small files are not optimal for tape storage. Therefore, we encourage users to pack small files if there is no need to use the netCDF metadata features of StrongLink (see File Search and Metadata).
Upper size limit: file sizes of a few TB are possible and have been successfully tested, but we recommend the same sizes as for HPSS: max. 500 GB.
I am member of a project but cannot access this projects data?#
If you were added to the project recently, please login again via slk login
. For details please see group memberships of user updated on login.
If you have been a member of the particular project for a long time or have followed the previous instructions, please let another user of the project check whether the group permissions are properly set.
If both approaches do not work, please contact support@dkrz.de.
Why do I get “Exception …: lateinit property websocket has not been initialized”?#
When running slk archive with the argument --streams N, please only use values between 1 and 4 for N. For details please see slk archive: Exception …: lateinit property websocket has not been initialized.
My slk archive seems to hang. What should I do?#
Please check whether /home
is hanging. If /home
is hanging, slk
cannot access its login token and cannot write into its log. Therefore, slk
hangs when /home
is hanging. You might also run slk archive
with -v
or -vv
to see the progress.
How much data can I archive at once?#
We suggest archiving not more than approximately 5 TB with one call of slk archive. If you archive more than that and the StrongLink system is busy, the transfer might be interrupted unexpectedly. The slk log (~/.slk/slk-cli.log) will show this error (look for unexpected end of stream on https://archive.dkrz.de/...):
2022-11-24 11:16:22 INFO Executing command: "archive -R /work/ab1234/c567890/much_data /arch/zy0987/c567890/target
2022-11-24 11:18:25 ERROR Unexpected exception
java.io.IOException: unexpected end of stream on https://archive.dkrz.de/...
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:202) ~[slk-cli-tools-3.3.21.jar:?]
[...]
[...]
[...]
... 16 more
2022-11-24 11:18:25 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 0/85083 files [0B/20.3T]
If you want to keep archiving larger amounts than 10 TB at once – e.g. 100 TB –, please be prepared to run slk archive repeatedly. In the end, a summary similar to this one should be printed to the log:
2022-11-25 11:21:10 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 0/85083 files [0B/20.3T]
Total files skipped: 85083/85083 files [20.3T/20.3T]
Unchanged files: 85083
or this one:
2022-11-25 09:46:55 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 4342/85083 files [1.2B/20.3T]
Total files skipped: 80741/85083 files [20.3T/20.3T]
Unchanged files: 80741
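Running slk archive repeatedly, as described above, can be automated with a small retry loop. This is a sketch under assumptions: retry_until_success is a hypothetical helper that is not part of slk or slk_helpers, and it relies only on the exit code of the command it runs.

```shell
# Sketch: rerun a command until it exits with 0 or the attempt limit is reached.
# retry_until_success is a hypothetical helper, not part of slk or slk_helpers.
retry_until_success() {
    local max_attempts=$1 attempt=1
    shift
    while [ "$attempt" -le "$max_attempts" ]; do
        # run the remaining arguments as the command to be repeated
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
# possible call, using the paths from the log excerpt above:
# retry_until_success 5 slk archive -R /work/ab1234/c567890/much_data /arch/zy0987/c567890/target
```

The loop deliberately does nothing clever between attempts. When the StrongLink system is under high load, it may be sensible to add a pause before the next attempt.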
What does the error “unexpected end of stream on https://archive.dkrz.de/…” mean?#
This error is not printed to the command line but into the slk log. The full error looks like this:
2022-11-24 11:16:22 INFO Executing command: "archive -R /work/ab1234/c567890/much_data /arch/zy0987/c567890/target
2022-11-24 11:18:25 ERROR Unexpected exception
java.io.IOException: unexpected end of stream on https://archive.dkrz.de/...
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:202) ~[slk-cli-tools-3.3.21.jar:?]
[...]
[...]
[...]
... 16 more
2022-11-24 11:18:25 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 0/85083 files [0B/20.3T]
This error might be thrown if a large amount of data is archived (approximately more than 5 TB to 10 TB) while the StrongLink system is under high load due to many user archivals and retrievals.
The output 0/85083
is not necessarily correct and some files might have been archived. The slk
client does not get confirmation of successful archival before it stops due to an interrupted data stream. The philosophy of slk
in this context is: better an error is displayed, although the file was archived correctly, than no error is displayed, although the file was not archived correctly.
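Since this error only shows up in the slk log, it can be handy to look up the status of the most recent archive report there. The following is a minimal sketch, assuming the log keeps the layout shown above, with a line that starts with "Status:" in each report; last_archive_status is a hypothetical helper name.

```shell
# Sketch: print the status of the most recent "Archive report" in the slk log.
# last_archive_status is a hypothetical helper; it assumes that each report
# contains a line starting with "Status:" as in the excerpt above.
last_archive_status() {
    local log=${1:-$HOME/.slk/slk-cli.log}
    # keep only the last "Status:" line and strip the label
    grep '^Status:' "$log" | tail -n 1 | sed 's/^Status:[[:space:]]*//'
}
```

Called without an argument, the function reads ~/.slk/slk-cli.log. It prints nothing if the log contains no report.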
Additional features#
Which new features does the HSM System provide?#
slk search. Details on the search feature: File Search and Metadata.
From which file types are extended metadata harvested?#
Which metadata fields are harvested from netCDF files?#
Is there a python interface available?#
pyslk. It is installed in python3/2022.01-gcc-11.2.0, python3/2023.01-gcc-11.2.0 and python3/unstable on Levante. The latter module might contain a newer pyslk version which is in the final testing phase. The slk module has to be loaded when pyslk is used. For details please see https://hsm-tools.gitlab-pages.dkrz.de/pyslk .
Is it possible to use slk chmod and slk group (=chgrp) commands recursively by the user?#
-R to apply these commands recursively.
Are the search IDs user specific?#
slk list SEARCH_ID or retrieval of slk retrieve SEARCH_ID ... depends on the read permissions of the executing user.
How long are the search IDs stored?#
Is a search ID automatically updated when new files are archived which match the original search query?#
slk list SEARCH_ID will show today's sizes of files covered by the search ID SEARCH_ID. Files that first matched a search query are still listed by slk list even if they no longer match the original search query. This might happen if a file is renamed.
What does “RQL” mean?#
What is the “StrongLink Query Language”?#
slk search. Please see StrongLink query language Reference and File Search and Metadata for details.
Is there any possibility to move around in the file system with something like the cd command?#
slk does not start its own shell like pftp or pure ftp do. It rather works like scp or rsync.
When slk list shows a file with “-” (not “t”) which means it exists at the cache: Does that mean it is not yet on the tape?#
For a better overview of the archived files, Is there a possibility to list only folders, not all files?#
slk list with a specific namespace path, it shows all the files and namespaces in that specific namespace. It does not provide an argument like -d as ls does. You might use slk list GNS_PATH | grep -E "^d" to print only folders.
Alternatively, slk_helpers list_search -d SEARCH_ID can be used to print all namespaces that have been found by the search SEARCH_ID. Details on searches: File Search and Metadata
Is it possible to remove files from the archive?#
slk delete for removing files and slk delete -R for removing namespaces and their content. Using wildcards or Regular Expressions with slk delete is not possible.
How to print the version of slk?#
slk version to print the version of slk. A --version flag or similar does not exist.
How to search non-recursively in a namespace?#
By default, slk search
searches recursively in a namespace provided via path
. To do a non-recursive search, please add "$max_depth": 1
to the JSON expressions as follows: {"path": {"$gte": "/ex/am/ple/path", "$max_depth": 1}}
. Alternatively, please get the object id of the particular namespace via slk_helpers exists
and, then in your search query, use it as value for the search field resources.parent_id
(see slk Usage Examples)
Is it possible to move files within the archive?#
slk move for moving a file or namespace from one namespace to another. Absolute paths have to be used: slk move /old_path/file.nc /new_path. Renaming cannot be done with slk move. I.e. this does not work: slk move /old_path/file.nc /new_path/new_file_name.nc. Please use slk rename for renaming operations.
Is it possible to rename files within the archive?#
slk rename to rename a file or namespace. slk rename cannot be applied to multiple files/namespaces.
How do I tag a folder with metadata?#
Tagging folders with metadata is not possible at the moment.
How do I tag an individual file with metadata?#
slk tag
accepts individual files as input since slk version 3.3.56. If you want to tag several files in different folders, please run a search for these files and use the resulting search id as input for slk tag
. Alternatively, you can use slk_helpers json2hsm
which accepts metadata in JSON as input. Details: Set metadata.
How do I add metadata to a file?#
Please see How do I tag an individual file with metadata?.
How do I assign metadata to a file?#
Please see How do I tag an individual file with metadata?.
Can I manually edit file metadata in StrongLink?#
Yes. Please see How do I tag an individual file with metadata?.
Can I overwrite metadata manually?#
Yes. Please see How do I tag an individual file with metadata?.
slk search does not find any resources although resources exist that seem to match the query#
Example command:#
$ slk search '{"resources.posix_uid": "25301"}'
Search continuing. .....
Search ID: 216
Reason:#
The query parser does not recognize when a wrong variable type is used. resources.posix_uid
is of type integer
and not string
. Providing the wrong data types leads to 0 found results.
Solution:#
Write 25301
(integer) instead of "25301"
(string) in the search query.
slk search '{"resources.posix_uid": 25301}'
Search continuing. ..... Search continuing. .....
Search ID: 217
Error: slk search yields RQL parse error#
Example command and error:#
$ slk search "{\"resources.size\":{\"$gt\": 1048576}}"
ERROR: Search failed. Reason: RQL parse error: No period found in collection field name ().
Reason:#
The $
in front of the gt
was not escaped. Therefore, $gt
is interpreted as environment variable by the shell before the query is handed to the slk
. In most situations, no environment variable gt
is defined leading to an empty string. If the query were surrounded by '
as delimiter and not by "
then the $gt
would not have been interpreted.
The above call of slk search
as interpreted by the shell looks like
$ slk search "{\"resources.size\":{\"\": 1048576}}"
Solution:#
Either: use ' as delimiter of your search query instead of " to prevent operators starting with $ from being evaluated by your shell.
Or: escape each $ in front of query operators with \ when you use " as delimiter of the query string.
'{"resources.size":{"$gt": 1048576}}'
"{\"resources.size\":{\"\$gt\": 1048576}}"
Note
In some situations it might be very useful to use "
as delimiter for your queries – e.g. if environment variables are part of your query.
$ export file_size=1048576
$ slk search "{\"resources.size\":{\"\$gt\": $file_size}}"
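An alternative to escaping by hand is to build the query string with printf and a single-quoted format string, so that the shell never sees $gt as a variable. The following is a sketch only; build_size_query is a hypothetical helper name and not part of slk.

```shell
# Sketch: build a size query from a shell value without escaping $gt by hand.
# build_size_query is a hypothetical helper name.
build_size_query() {
    local min_bytes=$1
    # reject anything that is not a non-negative integer to keep the JSON valid
    case $min_bytes in
        ''|*[!0-9]*) echo "size must be a non-negative integer" >&2; return 1 ;;
    esac
    # the format string is single-quoted, so the shell never expands $gt
    printf '{"resources.size":{"$gt": %s}}\n' "$min_bytes"
}
# possible call:
# slk search "$(build_size_query "$file_size")"
```

For the value 1048576 the function prints {"resources.size":{"$gt": 1048576}}, which is the same query as in the solution above.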
Advanced Technical Aspects#
Can a user influence if data is written into the HSM cache or onto tape?#
How much time does a file stay on the cache?#
How fast can be read from the HSM?#
slk retrieve will be idle until a tape drive is free. The queue length for tape read jobs can be printed by slk_helpers job_queue. Please see Waiting and processing time of retrievals for details.
Do GIGA-files still exists in StrongLink?#
slk search tool followed by slk list to generate a list of all files of your project:
# NOTE: PATH might be different in final HSM setup
# the query is single-quoted so that $gte is not expanded by the shell;
# awk extracts the number from the "Search ID: N" line of the output
$ SEARCH_ID=$(slk search '{"path":{"$gte":"/ex/am/ple/arch/bm0146"}}' | awk '/Search ID/ {print($3)}')
# OR
$ SEARCH_ID=$(slk search '{"resources.posix_gid":1076}' | awk '/Search ID/ {print($3)}')
$ slk list $SEARCH_ID
How do I determine the id (uid) of a DKRZ user?#
# get your user id
$ id -u
# get the id of any user
$ id -u USER_NAME
# get the id of any user
$ getent passwd USER_NAME
# OR
$ getent passwd USER_NAME | awk -F: '{ print $3 }'
How do I determine the id (gid) of a DKRZ group?#
# get group ID and group members
$ getent group GROUP_NAME
# OR
$ getent group GROUP_NAME | awk -F: '{ print $3 }'
# get the names and ids of all groups of which you are a member
$ id
How do I determine the username of a DKRZ user when I have her/his id (uid)?#
# get the name of a user with uid USER_ID
$ getent passwd USER_ID
# OR
$ getent passwd USER_ID | awk -F: '{ print $1 }'
How do I determine the group name of a DKRZ group when I have its id (gid)?#
# get group name of a group with gid GROUP_ID
$ getent group GROUP_ID
# OR
$ getent group GROUP_ID | awk -F: '{ print $1 }'
How do I determine the MIME type of a file?#
file --mime-type FILE or file -b --mime-type FILE to determine the MIME type on the Linux shell. Please be aware that different tools determine the MIME type differently (i.e. by file header or by file extension) and MIME type databases might differ. It might be better not to search for a specific MIME type but for a particular file extension, e.g. via {"resources.name": {"$regex": ".*nc$"}}. StrongLink allocates the MIME type application/x-netcdf to netCDF files.
Can the search ID of slk search be captured by a shell variable?#
slk search does not provide this feature out of the box. Currently (might change in future versions), the search ID is printed in columns >= 12 of the second row of the text output of slk search. We can use tail and sed to get the second line and extract a number or use tail and cut to get the second line and drop the first 11 characters. Example:
# normal call of slk search
$ slk search '{"resources.posix_uid": 23501}'
Search continuing. .....
Search ID: 466
# get ID using sed:
$ search_id=`slk search '{"resources.posix_uid": 23501}' | tail -n 1 | sed 's/[^0-9]*//g'`
$ echo $search_id
470
# get ID by dropping first 11 characters of the second line
$ search_id=`slk search '{"resources.posix_uid": 23501}' | tail -n 1 | cut -c12-20`
$ echo $search_id
471
# use awk pattern matching to get the correct line and correct column
$ search_id=`slk search '{"resources.posix_uid": 25301}' | awk '/Search ID/ {print($3)}'`
$ echo $search_id
507
Note
This is an example for bash. When using csh, you need to prepend set to the assignments of the shell variables: set search_id=...
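The awk variant from the example above can be packaged as a small filter that also checks that a purely numeric ID was found. This is a sketch: extract_search_id is a hypothetical helper name, and it relies on the "Search ID: N" line shown above, which might change in future slk versions.

```shell
# Sketch: read the text output of slk search on stdin and print the search ID.
# extract_search_id is a hypothetical helper; it relies on the "Search ID: N"
# line shown above, which might change in future slk versions.
extract_search_id() {
    local id
    # remember the third field of the last line that starts with "Search ID:"
    id=$(awk '/^Search ID:/ { id = $3 } END { print id }')
    # only accept a purely numeric ID, otherwise signal failure
    case $id in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$id"
}
# possible call:
# search_id=$(slk search '{"resources.posix_uid": 25301}' | extract_search_id)
```

If the search fails and no ID line is printed, the function returns a non-zero exit code instead of handing an empty or garbled value to later commands.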
Is the metadata of files within zip/tar files evaluated/ingested?#
Is it possible to create symlinks between lustre_path/files and tape_path/files?#
Does the packems package work with the new HSM system?#
packems has been adapted to the new HSM system in cooperation with the MPI-M. unpackems needs slk retrieve to work and, hence, does not run on the login nodes. For packems we recommend having slk retrieve available. Additionally, packing and archiving files causes a high CPU and memory load. Therefore, we strongly recommend not using packems on the login nodes. Instead, please use the interactive, shared or compute partitions. Please have a look into our packems quick help and into the packems manual: https://code.mpimet.mpg.de/projects/esmenv/wiki/Packems .
Is it possible to use listems to list files that were archived with packems on the HPSS?#
Is it possible to use unpackems to retrieve files that were archived with packems on the HPSS?#
unpackems does not run on the login nodes. Please see Does the packems package work with the new HSM system? for details.
Can you work directly with files in the archive (e.g. with Python)?#
Terminal cursor disapears after stopping a slk command. How to get it back?#
If a slk command with a progress bar is canceled by the user, the shell cursor might disappear. One can make it re-appear by (a) running reset
or (b) starting vim
and leaving it directly (:q!
).
Is a file stored in the HSM cache or exclusively on tape?#
Solution a: In the output of slk list
, please check the 11th character of the first column (permissions string). If this character is t
then the file is exclusively stored on tape. If it is a -
then the file is available from the HSM cache.
Solution b: Use slk_helpers iscached RESOURCE_PATH to check whether a file is available from the HSM cache (exit code is 0) or not (exit code is 1).
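Solution a can be expressed as a small function that inspects the 11th character of a permissions string. The following is a minimal sketch; storage_location is a hypothetical helper name, and the permissions string is assumed to be passed in as it appears in the first column of slk list.

```shell
# Sketch: classify one permissions string from slk list by its 11th character.
# storage_location is a hypothetical helper name.
storage_location() {
    local perms=$1 flag
    # cut -c11 extracts exactly the 11th character (empty if the string is shorter)
    flag=$(printf '%s' "$perms" | cut -c11)
    case $flag in
        t) echo "tape" ;;
        -) echo "cache" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

The function prints tape for a trailing t and cache for a trailing -, and it returns a non-zero exit code for anything else, for example when the string has fewer than 11 characters.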
What is an exit code?#
While exiting, each program returns an integer number, which indicates whether the program finished successfully or not. Exit codes are not printed to the stdout or stderr streams but need to be explicitly captured by the user. An exit code of 0 indicates that everything went well. Exit codes >0 usually indicate that something went wrong. However, non-zero exit codes do not necessarily mean that an error occurred. If grep does not match anything and no error occurs, it returns 1 as exit code. It returns 2 if an error occurs.
How do I capture exit codes?#
The shell variable $? contains the exit code of the preceding command. Examples:
# successful program call
$ slk version
SCLI Version 3.3.21
$ echo $?
0
# failed program call
$ slk retrieve abc def
ERROR: No resource exists with the following path: abc
$ echo $?
1
When commands are combined in a pipeline, $? holds the exit code of the last command in the pipeline. Examples:
# we capture the exit code of slk list
$ slk list quatsch
ERROR: The list command requires a search ID, or a full namespace path starting with a forward slash (/).
$ echo $?
1
# we capture the exit code of cat (which exits successfully)
$ slk list quatsch | cat
ERROR: The list command requires a search ID, or a full namespace path starting with a forward slash (/).
$ echo $?
0
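If the exit code of the first command of such a pipeline is needed, bash offers the array PIPESTATUS. The following is a sketch for bash only; run_through_cat is a hypothetical helper name that mimics the "| cat" example above.

```shell
# Sketch (bash only): pipe a command's output through cat, but return the
# exit code of the command itself instead of the exit code of cat.
# run_through_cat is a hypothetical helper name.
run_through_cat() {
    "$@" | cat
    # PIPESTATUS holds the exit codes of all commands of the last pipeline;
    # index 0 belongs to the first command
    return "${PIPESTATUS[0]}"
}
# possible call:
# run_through_cat slk list quatsch; echo $?
```

Alternatively, set -o pipefail makes bash report a failure of any command in the pipeline through $?.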
In some situations it might be valuable to store the exit code in a variable. This example is from a script:
...
slk retrieve real_data.nc /arch/bm0146/k204221/test_data
exit_code=$?
if [ $exit_code -ne 0 ]; then
>&2 echo "an error of $exit_code occurred at `date` in slk retrieve call. Proceeding with next retrieval"
else
echo "retrieval successful"
fi
...
If time is used with another command, the other command’s exit code is always returned although time finishes last.
Which exit codes does slk return?#
Please see here
Which exit codes do the slk_helpers return?#
Please see here
Why is slk automatically “killed”?#
We run slk and it is killed with a message similar to this:
/sw/spack-levante/slk-3.3.21-5xnsgp/bin/slk: line 16: 126083 Killed LC_ALL=en_US.utf8 LANG=en_US.utf8 ${SLK_JAVA} -Xmx4g -jar $JAR_PATH "$@"
slk was killed by the operating system because it used more system resources than it was allowed to use, commonly too much memory. Please allocate sufficient memory to your job: run salloc or sbatch with --mem=6GB. If your slk is still killed with this message, then please inform the DKRZ support (support@dkrz.de) and allocate 8 GB or 10 GB of memory. If you ran slk on a login node of Levante, please switch to the interactive partition for interactive usage of slk (Run slk in the “interactive” partition) or to the compute or shared partitions for batch processing (Run slk as batch job).
Why does slk list hang when I want to list the results of a search?#
Problem:#
slk list SEARCH_ID
seems to hang but can be killed without problems.
Reason:#
slk list SEARCH_ID first collects all search results and then prints them. The run time of slk list scales linearly with the number of results (20s to 60s per 1000 results). Hence, if you want to print a list of 10000 files which were found by slk search you might have to wait about 3 to 10 minutes until the list is printed.
As an alternative to slk list, you can run slk_helpers list_search on the same SEARCH_ID, which will continuously print collected search results.
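The scaling quoted above allows a rough estimate of the waiting time before slk list is started. This is only a back-of-the-envelope sketch based on the 20 s to 60 s per 1000 results mentioned in the text; estimate_list_seconds is a hypothetical helper name.

```shell
# Sketch: estimate the run time of slk list SEARCH_ID from the number of
# results, using the 20 s to 60 s per 1000 results quoted above.
# estimate_list_seconds is a hypothetical helper name.
estimate_list_seconds() {
    local n_results=$1
    # integer arithmetic: seconds = results * rate / 1000
    echo "$(( n_results * 20 / 1000 )) $(( n_results * 60 / 1000 ))"
}
```

For 10000 results the function prints 200 600, i.e. a lower and an upper estimate in seconds.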
Solution:#
Please refine your search. The section Search files by metadata might help in this context.
Common issues#
Please see the extra page Known Issues
Changelog#
v1.43, 08 December 2023#
v1.42, 07 December 2023#
modified: What type of HSM system is used at DKRZ?
modified: Would it be possible/desirable to use only one command for slk and slk_helpers main classes?
modified: Where can I use slk and slk_helpers?
modified: How do I login to the HSM system?
modified: Can I use the command line tool non-interactively?
modified: Can I provide a file list to “slk archive” such as “-T” for “tar”?
modified: I am member of a project but cannot access this projects data?
modified: Is there a python interface available?
modified: How long are the search IDs stored?
modified: What is the “StrongLink Query Language”?
modified: Can the search ID of slk search be captured by a shell variable?
v1.41, 01 December 2023#
v1.40, 16 January 2023#
modified: Are there requirements on the file size for the tape archival?
modified: How do I automatically/non-interactively check whether I own a valid slk login token?
modified: Can a user influence if data is written into the HSM cache or onto tape?
modified: How fast can be read from the HSM?
modified: Does StrongLink automatically check the integrity of archived and retrieved files?
modified: Is the metadata of files within zip/tar files evaluated/ingested?
removed: What are the main differences compared to the old system?
removed: Why did DKRZ get a new system?
removed: Is the new HSM system accessbile via pftp?
removed: When did the new HSM system go online?
removed: Are my archived data available on the new system?
removed: How do I find out whether I have data from DXUL that have to be copied manually?
removed: How do I access DXUL data after the HPSS is shut down?
removed: How to proceed if I still have DXUL data that need to be kept?
removed: Can I still use pftp to interact with the new HSM system?
v1.39, 08 December 2022#
v1.38, 18 October 2022#
modified: How do I interact with the new system?
modified: Where can I use slk and slk_helpers?
modified: Do I have write access to the archive from outside the DKRZ?
modified: How do I create directories in the HSM?
modified: Does StrongLink automatically check the integrity of archived and retrieved files?
modified: How do I automatically/non-interactively check whether I own a valid slk login token?
modified: What is the “StrongLink Query Language”?
modified: Can the search ID of slk search be captured by a shell variable?
removed: Why is no exact time schedule for training and migration published yet?
removed: Does the tape quota (/arch, /doku), which was assigned to my computing time project, remain unchanged?
new: Why does slk list hang when I want to list the results of a search?
v1.37, 14 June 2022#
grammar and spelling corrections in various questions
v1.36, 07 June 2022#
v1.35, 03 June 2022#
modified: Who do I contact when I have questions or issues regarding the new HSM system and its usage?
modified: How do I interact with the new system?
modified: How do I find out whether I have data from DXUL that have to be copied manually?
modified: Where can I use “slk” and “slk_helpers”?
modified: How do I login to the HSM system?
modified: Do I have to provide my login credentials each time I use the command line tool?
modified: From which file types are extended metadata harvested?
modified: Is there any possibility to move around in the file system with something like the cd command?
modified: Are there requirements on the file size for the tape archival?
modified: How do I check for how long my login token is still valid?
modified: Can a user run multiple archival and retrieval requests at a time?
modified: Is there a python interface available?
modified: Can a user influence if data is written into the HSM cache or onto tape?
modified: Does the packems package work with the new HSM system?
modified: Is it possible to use listems to list files that were archived with packems on the HPSS?
modified: How fast can be read from the HSM?
modified: Does StrongLink automatically check the integrity of archived and retrieved files?
modified: Is there an option to continue archiving if it was interrupted?
modified: Is it possible to use unpackems to retrieve files that were archived with packems on the HPSS?
renamed: from Where on mistral and levante should I run slk? to Where on Levante should I run slk?
renamed: from Is a file stored in the HSM cache or already exclusively on tape? to Is a file stored in the HSM cache or exclusively on tape?
v1.34, 19 April 2022#
modified: Where can I use slk and slk_helpers?
modified: Can a user run multiple archival and retrieval requests at a time?
modified: What is the “StrongLink Query Language”?
modified: Can the search ID of slk search be captured by a shell variable?
modified: Does the packems package work with the new HSM system?
modified: Is it possible to use listems to list files that were archived with packems on the HPSS?
modified: Is it possible to use unpackems to retrieve files that were archived with packems on the HPSS?
v1.33, 30 March 2022#
modified: Is there a python interface available?
modified: Can a user run multiple archival and retrieval requests at a time?
new: I am a member of a project but cannot access this project’s data?
new: Why do I get “Exception …: lateinit property websocket has not been initialized”?
new: Is a file stored in the HSM cache already or exclusively on tape?
new: slk search does not find any resources although resources exist that seem to match the query
v1.32, 28 February 2022#
removed: Does the tape archive hardware also change?
removed: When did the HPSS go offline? / When does the HPSS go offline?
modified: How do I find out whether I have data from DXUL that have to be copied manually?
modified: Where can I use slk and slk_helpers?
modified: How do I login to the HSM system?
modified: Does StrongLink automatically check the integrity of archived and retrieved files?
modified: Do I manually need to check the integrity of archived and retrieved files?
modified: How do I check for how long my login token is still valid?
modified: Are there requirements on the file size for the tape archival?
modified: From which file types are extended metadata harvested?
modified: Is there a python interface available?
modified: How fast can be read from the HSM?
modified: Does the packems package work with the new HSM system?
modified: Is it possible to use listems to list files that were archived with packems on the HPSS?
modified: Is it possible to use unpackems to retrieve files that were archived with packems on the HPSS?
renamed: from Where on mistral and levante should I run slk? to Where on mistral should I run slk?
v1.31, 11 February 2022#
v1.30, 06 December 2021#
removed content of section Common Issues (moved to page Known Issues)
removed: error “conflict with jdk/…” when the slk module is loaded (moved to page Known Issues)
removed: slk needs a specific Java version (moved to page Known Issues)
removed: slk search yields RQL parse error (moved to page Known Issues)
removed: slk login asks me to provide a hostname and/or a domain (moved to page Known Issues)
removed: Session key has expired (moved to page Known Issues)
removed: Login Unsuccessful - Incorrect Credentials (moved to page Known Issues)
removed: Archival fails and Java NullPointerException in the log (moved to page Known Issues)
renamed: from Terminal cursor disappears after stopping a slk command to Terminal cursor disappears after stopping a slk command. How to get it back?
v1.29, 18 November 2021#
modified: Session key has expired
v1.28, 12 November 2021#
modified: How do I create directories in the HSM?
v1.27, 11 November 2021#
renamed: from What type of HSM system will be installed? to What type of HSM system is used at DKRZ?
renamed (and modified): from Why is DKRZ getting a new system? to Why did DKRZ get a new system?
modified: What are the main differences compared to the old system?
renamed: from Will the new HSM system be accessible via pftp? to Is the new HSM system accessible via pftp?
modified: Does the tape archive hardware also change? (removed in FAQ version 1.32)
removed: Will there be a continuous changeover from HPSS to StrongLink HSM?
renamed (and modified): from When does the new HSM system go online? to When did the new HSM system go online?
renamed: from When does the HPSS go offline? to When did the HPSS go offline? (removed in FAQ version 1.32)
removed: When will an exact time schedule for the migration be published?
renamed (and modified): from Will all my archived data be available on the new system? to Are my archived data available on the new system?
removed: Where do I find data from the DXUL archive now?
modified: How do I access DXUL data after the HPSS is shut down?
modified: How to proceed if I still have DXUL data that need to be kept?
removed: What do I do with simulation results during the downtime between HPSS going offline and StrongLink going online?
renamed (and modified): from Will there be an introduction session to the new HSM system and its usage? to Has there been an introduction session to the new HSM system and will there be such sessions in future?
modified: Why is no exact time schedule for training and migration published yet?
modified: Where can I use slk and slk_helpers?
modified: Do I have to provide my login credentials each time I use the command line tool?
removed: How do I use the HSM/StrongLink test system?
modified: Do I manually need to check the integrity of archived and retrieved files?
modified: Can a user run multiple archival and retrieval requests at a time?
modified: Are there requirements on the file size for the tape archival?
renamed: from From which file types is extended metadata harvested? to From which file types are extended metadata harvested?
modified: Which metadata fields are harvested from netCDF files?
removed: Why does slk search show more search results than slk list lists for this search id?
modified: Is the metadata of files within zip/tar files evaluated/ingested?
modified: Is it possible to create symlinks between lustre_path/files and tape_path/files?
renamed: from Will it be possible to use listems to list files that were archived with packems on the HPSS? to Is it possible to use listems to list files that were archived with packems on the HPSS?
renamed: from Will it be possible to use unpackems to retrieve files that were archived with packems on the HPSS? to Is it possible to use unpackems to retrieve files that were archived with packems on the HPSS?
v1.26, 01 November 2021#
removed: Will I be able to see how the new HSM system will look like before it becomes productive?
removed: Will DKRZ users be able to test their archiving workflows before the new system goes online?
modified: When does the new HSM system go online?
modified: Why is no exact time schedule for training and migration published yet?
modified: Can I still use pftp to interact with the new HSM system?
renamed: from How will I interact with the new system? to How do I interact with the new system?
modified: Do I have write access to the archive from outside the DKRZ?
renamed: from Will it be possible to archive into my existing folder structure created on HPSS? to Is it possible to archive into my existing folder structure created on HPSS?
renamed: from Will there be a “double” storage feature as for HPSS? to Is there a “double” storage feature as for HPSS?
renamed: from From which file types will extended metadata be harvested? to From which file types is extended metadata harvested?
renamed: from Which metadata fields will be harvested from netCDF files? to Which metadata fields are harvested from netCDF files?
modified: Is there a python interface available?
renamed: from Will the packems package work with the new HSM system? to Does the packems package work with the new HSM system?
v1.25, 27 October 2021#
modified: When does the new HSM system go online?
modified: Why is no exact time schedule for training and migration published yet?
modified: Why does slk search show more search results than slk list lists for this search id? (removed in FAQ version 1.27)
modified: Can the search ID of slk search be captured by a shell variable?
v1.24, 23 October 2021#
modified: When does the new HSM system go online?
modified: Why is no exact time schedule for training and migration published yet?
v1.23, 15 October 2021#
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Why is no exact time schedule for training and migration published yet?
v1.22, 08 October 2021#
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Why is no exact time schedule for training and migration published yet?
v1.21, 01 October 2021#
new: Archival fails and Java NullPointerException in the log
v1.20, 29 September 2021#
modified: Why is no exact time schedule for training and migration published yet?
v1.19, 20 September 2021#
changed title of FAQ
corrected FAQ’s Changelog
v1.18, 17 September 2021#
added cross-references
minor layout changes
v1.17, 17 September 2021#
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Why is no exact time schedule for training and migration published yet?
modified: Who do I contact when I have questions or issues regarding the new HSM system and its usage?
v1.16, 17 August 2021#
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Why is no exact time schedule for training and migration published yet?
v1.15, 30 July 2021#
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Why is no exact time schedule for training and migration published yet?
new: How do I use the HSM/StrongLink test system? (removed in FAQ version 1.27)
new: Session key has expired
new: Login Unsuccessful - Incorrect Credentials
v1.14, 12 July 2021#
new: Will the new HSM system be accessible via pftp?
new: Would it be possible/desirable to use only one command for slk and slk_helpers main classes?
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Will there be an introduction session to the new HSM system and its usage?
modified: Will DKRZ users be able to test their archiving workflows before the new system goes online?
modified: Why is no exact time schedule for training and migration published yet?
new: Do I have write access to the archive from outside the DKRZ?
new: Will the new system be available as a Globus endpoint for external transfers?
modified: Does StrongLink automatically check the integrity of archived and retrieved files?
new: Can I provide a file list to “slk archive” such as “-T” for “tar”?
new: Can a user run multiple archival and retrieval requests at a time?
new: Are there requirements on the file size for the tape archival?
modified: Why does slk search show more search results than slk list lists for this search id? (removed in FAQ version 1.27)
new: Is there any possibility to move around in the file system with something like the cd command?
new: Can you work directly with files in the archive (e.g. with Python)?
v1.13, 29 June 2021#
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Why is no exact time schedule for training and migration published yet?
v1.12, 08 June 2021#
modified: How will I interact with the new system?
modified: Where can I use slk and slk_helpers?
modified: How do I create directories in the HSM?
modified: Does StrongLink automatically check the integrity of archived and retrieved files?
modified: Do I manually need to check the integrity of archived and retrieved files?
modified: Is there an option to continue archiving if it was interrupted?
modified: How do I automatically/non-interactively check whether I own a valid slk login token?
modified: How do I check for how long my login token is still valid?
modified: Why does slk search show more search results than slk list lists for this search id? (removed in FAQ version 1.27)
modified: Can the search ID of slk search be captured by a shell variable?
v1.11, 06 May 2021#
new: How do I check for how long my login token is still valid?
modified: Will I be able to see how the new HSM system will look like before it becomes productive?
modified: Will there be an introduction session to the new HSM system and its usage?
modified: Will DKRZ users be able to test their archiving workflows before the new system goes online?
modified: Why is no exact time schedule for training and migration published yet?
modified: How do I login to the HSM system?
modified: Can I use the command line tool non-interactively?
modified: How do I create directories in the HSM?
modified: What does “namespace”, “global namespace” or “gns” mean?
modified: Are the search IDs user specific?
modified: How do I automatically/non-interactively check whether I own a valid slk login token?
rephrased question: What do I do with simulation results during the downtime between HPSS going offline and StrongLink going online? (removed in FAQ version 1.27)
rephrased question: How do I determine the id (uid) of a DKRZ user?
rephrased question: How do I determine the id (gid) of a DKRZ group?
rephrased question: How do I determine the username of a DKRZ user when I have her/his id (uid)?
rephrased question: How do I determine the group name of a DKRZ group when I have its id (gid)?
rephrased question: How do I determine the MIME type of a file?
v1.10, 23 April 2021#
new: Why does slk search show more search results than slk list lists for this search id? (removed in FAQ version 1.27)
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Will there be an introduction session to the new HSM system and its usage?
modified: Will DKRZ users be able to test their archiving workflows before the new system goes online?
removed: slk is called in a directory in which the user has no write permissions
v1.09, 06 April 2021#
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Where do I find data from the DXUL archive now? (removed in FAQ version 1.27)
modified: Will there be an introduction session to the new HSM system and its usage?
modified: Why is no exact time schedule for training and migration published yet?
modified: Can I use the command line tool non-interactively?
modified: Is it possible to use slk chmod and slk group (=chgrp) commands recursively by the user?
v1.08, 12 March 2021#
v1.07, 10 March 2021#
modified: Why is DKRZ getting a new system?
modified: What are the main differences compared to the old system?
modified: Where can I find written documentation about the new HSM system?
modified: Why is no exact time schedule for training and migration published yet?
modified: How will I interact with the new system?
modified: How do I login to the HSM system?
modified: Do I have to provide my login credentials each time I use the command line tool?
modified: Does StrongLink automatically check the integrity of archived and retrieved files?
modified: Do I manually need to check the integrity of archived and retrieved files?
modified: Can the search ID of slk search be captured by a shell variable?
modified: Is the metadata of files within zip/tar files evaluated/ingested?
modified: error “conflict with jdk/…” when the slk module is loaded
v1.06, 08 March 2021#
modified: How do I create directories in the HSM?
new: What to do with simulation results during the downtime between HPSS going offline and StrongLink going online? (removed in FAQ version 1.27)
new: Is there an option to continue archiving if it was interrupted?
new: Does any command exist for deleting files immediately from /work in case of successful archival?
new: Will it be possible to archive into my existing folder structure created on HPSS?
new: What does “namespace”, “global namespace” or “gns” mean?
new: Is it possible to use chmod and chgrp commands recursively by the user?
modified: How do I get the username of a DKRZ user when I have her/his id (uid)?
modified: How do I get the group name of a DKRZ group when I have its id (gid)?
new: Can the search ID of slk search be captured by a shell variable?
new: Is the metadata of files within zip/tar files evaluated/ingested?
new: Is it possible to create symlinks between lustre_path/files and tape_path/files?
new: error “conflict with jdk/…” when the slk module is loaded
new: slk login asks me to provide a hostname and/or a domain
v1.05, 23 February 2021#
new: Does StrongLink automatically check the integrity of archived and retrieved files?
new: Do I manually need to check the integrity of archived and retrieved files?
new: How do I get the username of a DKRZ user when I have her/his id (uid)?
new: How do I get the group name of a DKRZ group when I have its id (gid)?
new section Common Issues
v1.04, 22 February 2021#
v1.03, 18 February 2021#
modified: When does the new HSM system go online?
modified: When does the HPSS go offline? (removed in FAQ version 1.32)
modified: Where can I find written documentation about the new HSM system?
v1.02, 12 February 2021#
modified: When does the new HSM system go online?
new: When will an exact time schedule for the migration be published? (removed in FAQ version 1.27)
new: Will I be able to see how the new HSM system will look like before it becomes productive?
modified: Will there be an introduction session to the new HSM system and its usage?
new: Why is no exact time schedule for training and migration published yet?
v1.01, 28 January 2021#
first public version