slk helpers: slk extension provided by DKRZ#

file version: 25 Feb 2025

current software versions: slk_helpers version 1.13.3

The slk_helpers are an extension to slk. slk is developed by StrongLink and belongs to the StrongLink HSM software. The slk_helpers have been developed at DKRZ to provide useful functionality that is not included in slk. The slk_helpers are loaded together with slk via module load slk. If specific usage information is missing on this help page or if you encounter errors, please contact the DKRZ support.


StrongLink uses the term “namespace” or “global namespace” (gns). A “namespace” is comparable to a “directory” or “path” on a common file system.


$ slk_helpers (--pid|--help [COMMAND]|COMMAND ....)
  • --help: print help for COMMAND if specified and print general help otherwise

  • --pid: print the process id of the slk_helpers command

slk_helpers help#

print help

$ slk_helpers help

lists all commands

slk_helpers version#

print version

$ slk_helpers version

Prints the current slk_helpers version.

slk_helpers checksum#

return checksums of resource; targets one resource only

$ slk_helpers checksum [-t CHECKSUM_TYPE] (RESOURCE_PATH|--resource-id RESOURCE_ID)
  • --resource-id: get checksums of a file with given resource id instead of path; default: -1

  • -t, --type: checksum_type (possible values: sha512, adler32); omit to print all available checksums

Prints the checksum(s) of a resource. If -t is set, the checksum of type CHECKSUM_TYPE is retrieved. Possible values are sha512 and adler32. If -t is not set, all available checksums are printed. It only works for files and not for namespaces. Namespaces do not have checksums.

StrongLink calculates two checksums of each archived file and stores them in the metadata. It compares the stored checksums with the file’s actual checksums at certain stages of the archival and retrieval process. Users normally do not need to check the checksums manually, but you can do so if you prefer. If a file has no checksum, then it has not been fully archived yet (e.g. the copying is still in progress or the archival process was canceled).
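A manual check after retrieval could look like the following minimal sketch. The function verify_sha512 is a hypothetical helper and not part of slk or slk_helpers. It assumes that the expected SHA-512 digest is passed as a plain hex string; the exact output format of slk_helpers checksum -t sha512 is not described on this page, so the digest may have to be extracted from that output first.

```shell
# verify_sha512 is a hypothetical helper, not part of slk or slk_helpers.
# Usage: verify_sha512 LOCAL_FILE EXPECTED_SHA512_HEX
verify_sha512() {
    file="$1"
    # normalise the expected digest to lower case
    expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    # sha512sum prints "<digest>  <file name>"; keep the digest only
    actual="$(sha512sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "checksum OK: $file"
    else
        echo "checksum MISMATCH: $file" >&2
        return 1
    fi
}
```

The function returns 0 on a match, 1 on a mismatch and 2 if the local file cannot be read, so it can be used in if statements of batch scripts.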

slk_helpers exists#

check if resource exists; targets one resource only

$ slk_helpers exists RESOURCE_PATH

Check if the resource RESOURCE_PATH exists. The resource id is returned if it exists. exists works for files and namespaces.

slk_helpers gen_file_query#

generates a search query JSON string for provided resource list

$ slk_helpers gen_file_query [-R] RESOURCE1 [RESOURCE2 [RESOURCE3 [...]]]
  • --cached-only: Search for files in the HSM cache; Default: false

  • -n / --no-newline: Do not print a newline at the end of the output; Default: false

  • --not-cached: Search only for files which must not be in the HSM cache; currently ignored / no function; Default: false

  • -R, --recursive: generate a query which does a recursive search

  • --tape-barcodes BARCODE1 [BARCODE2 [BARCODE3 [...]]]: Search only for files stored on tapes with the provided barcodes

Generates a search query which can be used with slk search to perform a search for the resources RESOURCE1, RESOURCE2, … . These can be either files or namespaces. If a filename without path is provided, then the file will be searched for everywhere in the HSM. Filenames may contain regular expressions but no bash wildcards/globs. The path to a file must not contain regular expressions.

The user can specify whether files only from selected tapes (--tape-barcodes ...), from the HSM cache (--cached-only) or not in the HSM cache (--not-cached) are to be retrieved.

Detailed examples and explanations are given in Generate search queries.

slk_helpers gen_search_query#

generates a search query JSON string searching provided fields

$ slk_helpers gen_search_query [-R] fieldname=value [fieldname=value [fieldname=value [...]]] --search-query '[existing search query]'
  • fieldname commonly consists of schema.field except when you search for a path or a smart_pool; see Reference: metadata schemata for all available metadata fields and their types

  • value is the value to search for; gen_search_query converts it to the correct type if needed

  • =: instead of =, the operators <, >, <= and >= can also be used; please set the whole condition 'fieldname<value' in quotation marks if an operator other than = is used

  • --search-query [...]: insert an existing search query which is connected via an and operator with the newly generated search query

  • -R, --recursive: generate a query which does a recursive search when the metadata fieldname path is used; -R has no effect if path is not used

Generates a search query which can be used with slk search to perform a search for files which fulfill the provided conditions.

Detailed examples are given in Generate search queries.

slk_helpers gfbt#

please see group_files_by_tape

slk_helpers group_files_by_tape#

check on which tapes the provided resources are stored and return a grouping of resources per tape

$ slk_helpers group_files_by_tape (<RESOURCE_PATH_1> [<RESOURCE_PATH_2> [...]]|--resource-ids <RESOURCE_ID_1> [<RESOURCE_ID_2> [...]]|--search-id <SEARCH_ID>|--search-query <SEARCH_QUERY>) [-R] [-l|--list] [-c|--count-files] [--gen-search-query|--run-search-query] [--print-tape-barcode|--print-tape-id] [--print-tape-status] [--json|--json-pretty] [(--smtnps|--set-max-tape-number-per-search) <N>]

One of <RESOURCE_PATH_1> [<RESOURCE_PATH_2> [...]] (<list of GNS paths>), --resource-ids <RESOURCE_ID_1> [<RESOURCE_ID_2> [...]], --search-id <SEARCH_ID> or --search-query <SEARCH_QUERY> is mandatory as input. These input types cannot be combined.

select type of input:

  • <RESOURCE_PATH_1> [<RESOURCE_PATH_2> [...]]: provide one or more paths to files or directories; directories only work with -R; filenames might contain regular expressions

  • -R, --recursive: Search namespaces recursively for input files

  • --regex, --evaluate-regex-in-input:

  • --resource-ids <RESOURCE_ID_1> [<RESOURCE_ID_2> [...]]: Use a list of resource ids as input

  • --search-id <SEARCH_ID>: Use an existing search as input

  • --search-query <SEARCH_QUERY>: Use a search query as input

select output format:

  • none: print a human-readable list with one tape per row

  • --count-tapes: only print the number of tapes; two lines are printed

  • --json: print the output as JSON (one line; see --json-pretty for pretty json)

  • --json-pretty: print the output as pretty JSON

  • --print-resource-id: print the resource id for each file instead of its path; is ignored when --gen-search-query, --run-search-query, --full or --count-files are set.

  • -v / --print-progress: verbosity level 1; print information on the progress of performed searches and similar

  • -vv: verbosity level 2; print more detailed information on the progress of performed searches and similar

  • -wf1 <DESTINATION_RETRIEVAL> / --retrieval-workflow-1 <...>: shortcut for -dst <...>, --wrid, -ns, --details and --count-files => print resource number per tape, write resources ids into file with one file per tape, create config file for recall and retrieve watchers

select what should be done (basic):

  • -d / --details: print details per tape; implies --print-tape-barcode and --print-tape-status

  • -c / --count-files: counts the files per tape and prints this number instead of a file list

  • -f / --full: print details and run a search per tape; implies --print-tape-barcode, --print-tape-status and --run-search-query

select what should be done (advanced):

  • -ao / --append-output: when -wrid is set and target files already exist, append output to them (error otherwise)

  • -dst <DESTINATION_RETRIEVAL> / --destinationPath <DESTINATION_RETRIEVAL>: ignore files which exist already in DESTINATION_RETRIEVAL

  • --gen-search-query: generate and print search query strings instead of the lists of files per tape

  • -ns: preserve original namespace in destinationPath

  • -oo / --overwrite-output: when -wrid is set and target files already exist, overwrite them (error otherwise)

  • --print-tape-barcode: print the tape barcode on the far left followed by a :, Default: false

  • --print-tape-id: print the tape id on the far left followed by a :, Default: false

  • --print-tape-status: print the status (AVAILABLE, BLOCKED or ERRORSTATE) of the tape of each file group. Additional special statuses are UNAVAILABLE and UNCLEAR. The meaning of the statuses is given in Tape Stati below.

  • --run-search-query: generate and run search query strings; print the resulting search ID instead of the lists of files per tape

  • --smtnps <N> / --set-max-tape-number-per-search <N>: set the maximum number of tapes N which are used per search; default: 1; max: 2

  • -wrid / --write-resource-id: write resource ids per tape to text files with names files_tape_<tape barcode>.txt; further created files are files_all.txt (all resource ids), tapes.txt (all tapes for which the first type of file are created) and (parameters for watcher scripts); possibly, the files files_multipleTape.txt, files_notStored, files_ignored and/or files_cached are created

Receives a list of files or a search id as input. Looks up which files are stored in the HSM cache and which are stored only on tape. Files on tape are grouped by tape: each line of the output contains all files which are on one tape. To print the tape barcode and the tape status, use --print-tape-barcode and --print-tape-status, respectively. The flag --details implies both. The meaning of the statuses is given in Tape Stati below. The user can directly create a search query for retrieving all files from one tape (--gen-search-query) or directly run this search (--run-search-query). The flag --full implies --run-search-query and --details. Additionally, the user can set --set-max-tape-number-per-search 2 to run one search per two tapes.

When you want to use the new slk_helpers retrieve and slk_helpers recall commands, you might choose to run gfbt with -wf1 <DESTINATION_RETRIEVAL>. You will receive multiple files – one file per tape – containing the ids of the resources stored on the respective tape. These files can be simply piped into the new recall or retrieve commands as follows:

$ slk_helpers gfbt -wf1 /scratch/k/k204221/blub
$ ls *.txt
$ cat files_tape_C42350L6.txt | slk_helpers recall --resource-ids
$ cat files_cached.txt | slk_helpers retrieve --resource-ids -d /scratch/k/k204221/blub


Please contact the DKRZ support if you encounter a tape with ERRORSTATE.

Structure of the output (if --count-tapes is not set):


The row cached is only printed if cached data are available. Its status is always AVAILABLE. The row multi-tape is only printed if at least one file is stored on multiple tapes. The row not stored is only printed when files without storage information are present. Multiple rows with tape might be printed, one row per tape.

The output looks as follows when --count-tapes is set:

N tapes with single-tape files
M tapes with multi-tape files

Where N is the number of tapes with single-tape-only files (== number of normal tapes) and M is the number of tapes on which files in the multi-tape category are stored.
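The two-line --count-tapes output can be reduced to a single number in a script. The following is a minimal sketch; count_tapes_total is a hypothetical helper which reads the output from stdin. It assumes that the two lines have exactly the form shown above and that the two groups of tapes do not overlap, which this page does not state.

```shell
# count_tapes_total is a hypothetical helper, not part of slk_helpers.
# It reads the output of "group_files_by_tape ... --count-tapes" from stdin
# and prints N + M.
count_tapes_total() {
    awk '/tapes with (single|multi)-tape files/ { total += $1 }
         END { print total + 0 }'
}
```

A possible call is slk_helpers gfbt <RESOURCE_PATH_1> -R --count-tapes | count_tapes_total. If no matching line is found, 0 is printed.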

slk_helpers hsm2json#

export resource metadata and return them in JSON format

$ slk_helpers hsm2json [options] <GNS path>
  • --instant-metadata-record-output: not set: read the metadata records of all specified files and print them when the last record is read; if set: print a metadata record directly after it has been read. Needs -l/--write-json-lines to be set. Default: false

  • -o FILE, --outfile FILE: Write the output into a file instead of to stdout

  • -q, --quiet: print nothing to stdout (e.g. no summary), Default: false

  • -R, --recursive: export metadata from the HSM recursively (all files in sub-directories of the provided source path will be considered), Default: false

  • -r FILE, --restart-file FILE: set a restart file in which the processed metadata entries are listed (if restart file exists, listed files will be skipped)

  • -s SCHEMA[,SCHEMA[...]], --schema SCHEMA[,SCHEMA[...]]: export only metadata fields of listed schemata (comma-separated list without spaces)

  • -v, --verbose: activate verbose mode, Default: false

  • -l, --write-json-lines: write JSON-lines instead of normal JSON, Default: false

  • -m MODE, --write-mode MODE: select write mode when -o/--outfile is set, Default: ERROR, Possible Values: [OVERWRITE, ERROR]

  • --print-summary: print summary on how many metadata records have been processed

  • --write-compact-json: do not print metadata as pretty but as compact JSON; default is pretty JSON

Extracts metadata from HSM file(s) and returns them in JSON structure. See JSON structure for/of metadata import/export for details.

slk_helpers hostname#

return hostname to which slk_helpers connects

$ slk_helpers hostname

Prints the hostname to which slk is currently connected or to which slk will connect. A default hostname is set on each Levante node. You can overwrite the default hostname by exporting the environment variable SLK_HOSTNAME (e.g. via export in bash).

slk_helpers iscached#

check if resources are cached

$ slk_helpers iscached [-v] [-vv] (RESOURCE_PATH [RESOURCE_PATH [...]]|--resource-id RESOURCE_ID|--search-id SEARCH_ID) [-R]
  • -R: search recursively in RESOURCE_PATH for files if RESOURCE_PATH is a namespace/directory

  • --resource-id: check caching status of file with provided RESOURCE_ID instead of RESOURCE_PATH and SEARCH_ID; default: -1

  • --search-id SEARCH_ID: check caching status of all files represented by provided SEARCH_ID instead of file with RESOURCE_PATH and a RESOURCE_ID; default: -1

  • -v: verbose mode; print list of non-cached files (== non-matching files) and a summary line

  • -vv: double verbose mode; print list of checked files incl. their status (is cached and is not cached) and a summary line


Please provide either RESOURCE_PATH or --search-id SEARCH_ID or --resource-id RESOURCE_ID.

Checks if the resource RESOURCE_PATH is stored in the HSM cache. Accepts multiple RESOURCE_PATHs. The user is informed via a text message whether RESOURCE_PATH exists. Additionally, the exit code is 0 if the resource is in the cache and 1 if not (read the variable $? directly after the slk_helpers call to get the exit code). When --search-id SEARCH_ID is set, more than one file might be checked. If at least one file is not cached, 1 is returned. 0 is only returned when all files are cached.

If a file is not stored in the cache, then it is only stored on tape. Retrievals from tape take considerably longer than retrievals from cache.
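The exit code can be used directly in a shell condition instead of reading $? by hand. The following is a minimal sketch; cache_state is a hypothetical wrapper and not part of slk_helpers. It treats every non-zero exit code as "not cached", which is a simplification because errors might also cause a non-zero exit code.

```shell
# cache_state is a hypothetical wrapper, not part of slk_helpers.
# Usage: cache_state RESOURCE_PATH
cache_state() {
    # only the exit code is of interest here; the text message is discarded
    if slk_helpers iscached "$1" > /dev/null 2>&1; then
        echo "cached"
    else
        echo "not cached"
    fi
}
```

A script could, for example, submit a recall only when cache_state prints not cached for the file of interest.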

slk_helpers is_admin_session#

check if the current user has admin permissions in StrongLink

$ slk_helpers is_admin_session

Check if the user is currently logged in as admin to StrongLink. Not useful for normal users. Might be used to check whether a connection to StrongLink is possible.

slk_helpers is_on_tape#

check if resources are stored on tape

$ slk_helpers is_on_tape [-v] [-vv] (RESOURCE_PATH [RESOURCE_PATH [...]]|--resource-id RESOURCE_ID|--search-id SEARCH_ID) [-R]
  • -R: search recursively in RESOURCE_PATH for files if RESOURCE_PATH is a namespace/directory

  • --resource-id: check tape storage status of file with provided RESOURCE_ID instead of RESOURCE_PATH and SEARCH_ID; default: -1

  • --search-id SEARCH_ID: check tape storage status of all files represented by provided SEARCH_ID instead of file with RESOURCE_PATH and a RESOURCE_ID; default: -1

  • -v: verbose mode; print list of files not on tape (== non-matching files) and a summary line

  • -vv: double verbose mode; print list of checked files incl. their status (is on tape and is not on tape) and a summary line


Please provide either RESOURCE_PATH or --search-id SEARCH_ID or --resource-id RESOURCE_ID.

Checks if the resource RESOURCE_PATH is stored on tape. If a file is stored on tape and in the cache, then this command will return true / on tape. Accepts multiple RESOURCE_PATHs. The user is informed via a text message whether RESOURCE_PATH exists. Additionally, the exit code is 0 if the resource is stored on a tape and 1 if not (read the variable $? directly after the slk_helpers call to get the exit code). When --search-id SEARCH_ID is set or RESOURCE_PATH / --resource-id RESOURCE_ID is a namespace, more than one file might be checked. If at least one file is not on tape, 1 is returned. 0 is only returned when all files are on tape.

If a file is not stored on tape then it is only stored in cache. Based on this command’s output it is not possible to determine whether a file is also stored in the HSM cache or not.

slk_helpers json2hsm#

read metadata from JSON file and attach them to resource in StrongLink

$ slk_helpers json2hsm [options] <SL-JSON metadata file> <GNS path>
  • -l, --expect-json-lines: consider the input file to be JSON-lines instead of normal JSON, Default: false

  • --ignore-non-existing-metadata-fields: if set: if a metadata field of the source metadata record does not exist in StrongLink, then this metadata field is skipped. if not set: throw an error and exit as soon as a source metadata field does not exist in StrongLink. If this flag is not set but -k/--skip-bad-metadata-sets is set, then metadata records with non-existing metadata fields will be skipped. Default: false

  • --instant-metadata-record-update: not set: read the whole JSON file and collect all metadata updates => apply all updates in the end; if two metadata records exist for one resource, this will become apparent before any metadata are written; if set: write each metadata record to StrongLink directly after it has been read from the JSON file; if two metadata records exist for one resource, the first metadata record will be written to StrongLink and the duplication will remain undetected until the duplicate record is read from JSON. Default: false

  • -q, --quiet: print nothing to stdout (e.g. no summary), Default: false

  • -r FILE, --restart-file FILE: set a restart file in which the processed metadata entries are listed (if restart file exists, listed files will be skipped)

  • -s SCHEMA[,SCHEMA[...]], --schema SCHEMA[,SCHEMA[...]]: import only metadata fields of listed schemata (comma-separated list without spaces)

  • -k, --skip-bad-metadata-sets: skip damaged / incomplete metadata sets [default: throw error], Default: false

  • -v, --verbose: activate verbose mode, Default: false

  • -m MODE, --write-mode MODE: select write mode for metadata, Default: OVERWRITE, Possible Values: OVERWRITE, KEEP, ERROR, CLEAN (CLEAN: first, delete all metadata from the target schema and, then, write new metadata)

Reads metadata from a JSON file and writes them to archived files in the HSM. Uses relative paths from metadata records plus a base path provided by the user to identify target files. See JSON structure for/of metadata import/export for details.

slk_helpers job_exists#

check if job with given ID exists in StrongLink

$ slk_helpers job_exists JOB_ID

Check if a tape read job (recall job) or verify job with the given ID exists.

slk_helpers job_queue#

print status information of the StrongLink queue

$ slk_helpers job_queue [(-i|--interpret) <INTERPRET_TYPE>]
  • -i <INTERPRET_TYPE> / --interpret <INTERPRET_TYPE>: interpret the length of the StrongLink recall job queue; possible values for INTERPRET_TYPE:

    • RAW / R: same as argument not set

    • TEXT / T: print short textual interpretation of the queue status => none, short, medium, long, jammed

    • DETAILS / D: print detailed textual interpretation of the queue status

    • NUMERIC / N: print a number representing the queue status => 0 (==none), 1 (==short), …, 4 (==jammed)

Prints length or the status of the queue of tape read jobs (recall jobs). The output looks like this:

$ slkh job_queue
total read jobs: 110
active read jobs: 12
queued read jobs: 98

$ slkh job_queue --interpret N

$ slkh job_queue --interpret T

or like this:

$ slkh job_queue
total read jobs: 4
active read jobs: 4
queued read jobs: 0

$ slkh job_queue --interpret N

$ slkh job_queue --interpret T

$ slk_helpers job_queue --interpret D
no queue, waiting time in the queue: none
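If a script needs the number of queued jobs as a plain integer, it can be extracted from the raw output. The following is a minimal sketch; queued_read_jobs is a hypothetical helper which reads the raw output from stdin and assumes that the line labels are exactly those shown in the examples above.

```shell
# queued_read_jobs is a hypothetical helper, not part of slk_helpers.
# It reads the raw output of "slk_helpers job_queue" from stdin and
# prints the value of the line "queued read jobs: <number>".
queued_read_jobs() {
    awk -F': *' '/^queued read jobs:/ { print $2 }'
}
```

A possible call is slk_helpers job_queue | queued_read_jobs. Nothing is printed if the expected line is missing.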

slk_helpers job_report#


This command is no longer needed or recommended and will be removed or deactivated in future releases of slk_helpers. We recommend using the command result_verify_job instead.

extract basic report of a StrongLink job and print it

slk_helpers job_status#

return status of the StrongLink job

$ slk_helpers job_status JOB_ID

Check the status of a tape read job with the given ID. The status is one of these: ABORTED, QUEUED, PROCESSING, COMPLETED, SUCCESSFUL, FAILED and PAUSED. When the status is QUEUED then the place in the queue is appended in brackets, e.g.: QUEUED (12).

See Job Stati for descriptions of the job statuses.
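Because the queue position is appended to the status in brackets, a script that polls a job has to separate the two. The following is a minimal sketch; split_job_status is a hypothetical helper. It assumes that job_status prints a single line of the form STATUS or STATUS (POSITION), as described above.

```shell
# split_job_status is a hypothetical helper, not part of slk_helpers.
# Usage: split_job_status "QUEUED (12)"
# Prints "<status> <queue position>"; the position is "-" if none is given.
split_job_status() (
    status="${1%% *}"                      # text before the first space
    case "$1" in
        *"("*")"*)
            position="${1#*"("}"           # drop everything up to "("
            position="${position%%")"*}"   # drop ")" and what follows
            ;;
        *)
            position=""
            ;;
    esac
    printf '%s %s\n' "$status" "${position:--}"
)
```

A possible call is split_job_status "$(slk_helpers job_status JOB_ID)". The function body runs in a subshell so that its variables do not leak into the calling script.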

slk_helpers has_no_flag_partial#

check if resources are not flagged as partial files

$ slk_helpers has_no_flag_partial (RESOURCE_PATH [RESOURCE_PATH [...]]|--resource-id RESOURCE_ID|--search-id SEARCH_ID) [-R] [-v|-vv]
  • -R: search recursively in RESOURCE_PATH for files if RESOURCE_PATH is a namespace/directory

  • --resource-id: check file with provided RESOURCE_ID instead of RESOURCE_PATH and SEARCH_ID; default: -1

  • --search-id SEARCH_ID: check all files represented by provided SEARCH_ID instead of file with RESOURCE_PATH and a RESOURCE_ID; default: -1

  • -v: single verbose mode; print list of files with flag (== non-matching files) and a summary line

  • -vv: double verbose mode; print list of checked files incl. their status (has no partial flag and has partial flag) and a summary line


Please provide either RESOURCE_PATH or --search-id SEARCH_ID or --resource-id RESOURCE_ID.

Checks if a resource RESOURCE_PATH is flagged as “partial file” and prints the resource path if this is the case. --invert inverts the command checking mechanism so that all files which are not flagged as “partial file” are printed. Accepts multiple RESOURCE_PATHs. Additionally, the exit code will be 0 if at least one match was found and 1 if no match was found (exit code: read the variable $? directly after the slk_helpers call).

slk_helpers list_clone_file#

print details on resources; similar to slk list but has additional options and prints more details

$ slk_helpers list_clone_file [--print-resource-ids] [--print-more-timestamps] [--print-timestamps-as-seconds-since-1970] [--proceed-on-error] (--read-from-file <LOCAL FILE>|<RESOURCE_PATH> [<RESOURCE_PATH> [<RESOURCE_PATH> [...]]])
  • --print-resource-ids: print resource ids instead of file paths

  • --print-more-timestamps: print additional timestamps which are not supported when the results of a search id are printed

  • --print-timestamps-as-seconds-since-1970: print timestamps in seconds since 1970

  • --proceed-on-error: Proceed listing files even if an error arose

  • --read-from-file <LOCAL FILE>: Read file list from file instead of command line arguments or stdin

Prints information on the provided <RESOURCE_PATH> s. Does not print the content of namespaces / folders but only information on the targeted resource itself.

If --read-from-file <LOCAL FILE> is set, the <RESOURCE_PATH> s are ignored.

The output consists of 10 columns:

  • col 1: permissions and storage location

  • col 2: uid / user id

  • col 3: gid / group id

  • col 4: size in byte

  • col 5: mtime of the file (time stamp of “last modification of the file prior to the archival”)

  • col 6: time stamp of first file version in StrongLink

  • col 7: time stamp of current file version in StrongLink

  • col 8: time stamp of last StrongLink-internal copy process of this file (e.g. last recall)

  • col 9: tape id if file is stored on one tape

  • col 10: full path

slk_helpers metadata#

prints resource metadata; targets one resource only

$ slk_helpers metadata RESOURCE_PATH
  • --alternative-output-format: different format to print metadata (each row is: schema.field: value), Default: false

Prints the available metadata of a resource. Corresponds to slk tag – whereas slk tag sets metadata and slk_helpers metadata prints metadata.

slk_helpers mkdir#

create a namespace (directory) in StrongLink

$ slk_helpers mkdir [-R] GNS_PATH
  • -p / --parents: create parent folders recursively if they do not exist; throw no error if the folder already exists (like ‘mkdir -p’)

    Default: false

  • -R: create parent folders recursively if they do not exist; throw an error if the folder already exists

Creates a namespace in an already existing namespace (== create basename GNS_PATH in dirname GNS_PATH). This command works like mkdir on a Linux terminal. Create nested namespaces recursively when -R is set. slk_helpers mkdir -p behaves like mkdir -p on the Linux terminal.

slk_helpers print_rcrs#

print resource storage details

$ slk_helpers print_rcrs (RESOURCE_PATH|--resource-id RESOURCE_ID)
  • --resource-id: get rcrs of a file with given resource id instead of path

Gets the resource content records (rcrs) for a resource path or resource id. Some files which were archived by HPSS were split into two parts which were stored on different tapes. If these files are accessed via StrongLink, each file part gets its own checksum. No overall checksum is stored for the combined file. Therefore, slk_helpers checksum prints no checksums for such files. If you need to verify such split files after retrieval, you can get the size and checksum of each file part via this command and then split the file via split -b <SIZE> <FILE>. The command does not necessarily print the file part information in the correct order. The information on the second file part might be printed first.
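The split-and-verify step can be sketched as follows. The function checksum_file_parts is a hypothetical helper and not part of slk_helpers. It assumes that SIZE is the size in bytes of the first file part and that the parts carry SHA-512 checksums; use a different checksum tool if print_rcrs reports another type. Note that split -b cuts the file into pieces of at most SIZE bytes, so more than two pieces are produced if the second part is larger than the first one.

```shell
# checksum_file_parts is a hypothetical helper, not part of slk_helpers.
# Usage: checksum_file_parts FILE SIZE_OF_FIRST_PART_IN_BYTES
# Prints one line per piece: <piece name> <size in bytes> <sha512 digest>
checksum_file_parts() (
    file="$1"
    size="$2"
    workdir="$(mktemp -d)" || exit 1
    # split writes part_aa (first SIZE bytes), part_ab (next piece), ...
    split -b "$size" "$file" "$workdir/part_"
    for part in "$workdir"/part_*; do
        printf '%s %s %s\n' \
            "$(basename "$part")" \
            "$(wc -c < "$part" | tr -d ' ')" \
            "$(sha512sum "$part" | awk '{print $1}')"
    done
    rm -rf "$workdir"
)
```

The printed sizes and digests can then be compared by eye or by script with the values reported by slk_helpers print_rcrs, keeping in mind that the records might be listed in a different order.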

slk_helpers recall#


Extended test phase of this command. Please report bugs and feature requests to the DKRZ support.

submit recall job

$ slk_helpers recall [<RESOURCE_PATH_1> [<RESOURCE_PATH_2> [...]]|<RESOURCE_ID_1> [<RESOURCE_ID_2> [...]]|<SEARCH_ID>] [(-d|--destination) <RETRIEVAL_DESTINATION>] [--dry-run] [-ns] [-R] [--resource-ids|--search-id] [--suppress-input-info] [-v|-vv]

The command attempts to read from stdin if neither <RESOURCE_PATH_1> ... nor <RESOURCE_ID_1> ... nor <SEARCH_ID> are provided. This means that you can also pipe data into this command.

  • -d <RETRIEVAL_DESTINATION>, --destination <...>: destination path; only files which do not exist in <destination> will be considered; only considered when individual source files are provided; if source is a namespace, this option is ignored

  • --dry-run: list which files would be recalled from tape to cache but do not actually start a recall job

  • -ns: preserve namespace in destination (takes effect only in combination with -d / --destination)

  • -R: use the -R flag to recall recursively

  • --resource-ids: interprets the input as a list of resource ids

  • --search-id: interprets the input as a search id

  • --suppress-input-info: suppress user information when input is read from stdin

  • -v: single verbose mode

  • -vv: double verbose mode

Starts recall job for the provided resources. Returns recall job ID instantly after job has been submitted. Can recall from a maximum of four tapes at once. If you provide resource ids or a search id, they do not need to be provided directly after --resource-ids or --search-id. Instead, these two parameters are merely switches for the interpretation of the input. They also apply when resources are provided via stdin.

Set -d <RETRIEVAL_DESTINATION> so that only files which are not already in <RETRIEVAL_DESTINATION> are recalled. recall expects the files to be directly in <RETRIEVAL_DESTINATION>. If -ns is set, reconstructs the full paths of the resources in <RETRIEVAL_DESTINATION>.

slk_helpers recall_needed#


Extended test phase of this command. Please report bugs and feature requests to the DKRZ support.

check whether recall job would be submitted if recall was run

$ slk_helpers recall_needed [<RESOURCE_PATH_1> [<RESOURCE_PATH_2> [...]]|<RESOURCE_ID_1> [<RESOURCE_ID_2> [...]]|<SEARCH_ID>] [(-d|--destination) <RETRIEVAL_DESTINATION>] [--dry-run] [-ns] [-R] [--resource-ids|--search-id] [--suppress-input-info] [-v|-vv]

The command attempts to read from stdin if neither <RESOURCE_PATH_1> ... nor <RESOURCE_ID_1> ... nor <SEARCH_ID> are provided. This means that you can also pipe data into this command.

  • -d <RETRIEVAL_DESTINATION>, --destination <...>: destination path; only files which do not exist in <destination> will be considered; only considered when individual source files are provided; if source is a namespace, this option is ignored

  • --dry-run: list which files would be recalled from tape to cache but do not actually start a recall job

  • -ns: preserve namespace in destination (takes effect only in combination with -d / --destination)

  • -R: use the -R flag to recall recursively

  • --resource-ids: interprets the input as a list of resource ids

  • --search-id: interprets the input as a search id

  • --suppress-input-info: suppress user information when input is read from stdin

  • -v: single verbose mode

  • -vv: double verbose mode

Checks whether a recall job would be submitted for the provided resources if slk_helpers recall were run. Behaves slightly differently from the recall command, which is why this feature was implemented as an extra command instead of an option like --dry-run. Differences between recall_needed and recall are:

  • returns text (“Resources need to be recalled.” / “No resources to recall.”) instead of an integer (recall job id)

  • does not check whether required tapes are available

If you provide resource ids or a search id, they do not need to be provided directly after --resource-ids or --search-id. Instead, these two parameters are merely switches for the interpretation of the input. They also apply when resources are provided via stdin.

Set -d <RETRIEVAL_DESTINATION> so that only files which are not already in <RETRIEVAL_DESTINATION> are recalled. recall_needed expects the files to be directly in <RETRIEVAL_DESTINATION>. If -ns is set, reconstructs the full paths of the resources in <RETRIEVAL_DESTINATION>.

slk_helpers resourcepath#


Might be deprecated soon. Please use slk_helpers resource_path instead.

prints path of resource with provided id; targets one resource only

$ slk_helpers resourcepath RESOURCE_ID

slk_helpers resource_id#

prints id(s) of resource(s) with provided path(s)

$ slk_helpers resource_id [<RESOURCE_PATH_1> [<RESOURCE_PATH_2> [...]]|--read-from-file <FILE_WITH_RESOURCE_PATHS>] [--proceed-on-error]

The command attempts to read from stdin if neither <RESOURCE_PATH_1> ... nor --read-from-file <FILE_WITH_RESOURCE_PATHS> are provided. This means that you can also pipe data into this command.

  • --proceed-on-error: Proceed listing files even if an error arose

  • --read-from-file <FILE_WITH_RESOURCE_PATHS>: Read file list from file instead of command line arguments or stdin

Prints the resource path and the corresponding resource id separated by ": " (colon and space). Prints one pair per line. Similar to exists.

slk_helpers resource_path#

prints path of resource with provided id; targets one resource only

$ slk_helpers resource_path RESOURCE_ID

Gets path for a resource id and returns it.

slk_helpers resource_permissions#

prints resource permissions; targets one resource only

$ slk_helpers resource_permissions (RESOURCE_PATH|--resource-id RESOURCE_ID)
  • --resource-id: get permissions of a resource with given resource id instead of path; default: -1

  • --as-octal-number: Do not return the permissions as combination of x, w, r and - but as three digit octal number.

Gets permissions for a resource path or resource id as combination of x, w, r and -.

slk_helpers resource_type#

prints type of resource (file or namespace); targets one resource only

$ slk_helpers resource_type (RESOURCE_PATH|--resource-id RESOURCE_ID)
  • --resource-id: get type of a file with given resource id instead of path; default: -1

Gets the resource type (FILE or NAMESPACE) for a resource path or resource id

slk_helpers resource_tape#

prints the tape on which a resource is stored; targets one resource only

$ slk_helpers resource_tape [--json] [--print-path] [--quiet|-q] (<RESOURCE_PATH>|--resource-id <RESOURCE_ID>)
  • --json: print output in JSON

  • --print-path: include the path of the file in the output

  • --quiet / -q: no summary line is printed

  • --resource-id: get tape of a file with given resource id instead of path; default: -1

Returns the tape(s) on which the provided file is stored. Does not process more than one file as input and does not process the content of namespaces / folders recursively.

slk_helpers result_verify_job#

return the result of a finished verify job

$ slk_helpers result_verify_job [--header|--sources|--number-errors|--number-sources] <job_id>
  • --header: print header of the report instead of errors; Default: false

  • --json: print output as JSON; full output

  • --json-no-source: print output as JSON; drop source information; useful when the job targeted 50000 files

  • --number-errors: print number of errors; Default: false

  • --number-sources: print number of source resources; note: if one resource was targeted, it might be a file or a namespace; if the number of targeted resources is larger than one, all of them were files

  • --quick: disable additional file verification checks performed for the verify job’s results; not recommended; might reduce run time

  • --sources: print the sources (sources resources, source namespace)

  • -v / --verbose: verbosity level 1

  • -vv / --double-verbose / --verbose-verbose: verbosity level 2

Prints the verification errors collected by the verify job with the id job_id. “Verification” means that the target size and the actual size of each targeted file are compared. Mismatches between these two sizes cause a verification error. The full report of a verification job, which can be extracted via slk_helpers job_report <job_id>, might contain additional warnings and errors which are not relevant for the user.

Since slk_helpers version 1.13.1, this command performs some additional checks on the verified files because a few rare issues are not detected by verify jobs. These rare issues arose fewer than 50 times in more than 20 million archived files. When more than 5000 files are targeted by a verify job, these additional checks might considerably increase the run time of result_verify_job. The argument --quick can be set to deactivate them and reduce the run time. We discourage the usage of this argument.

slk_helpers retrieve#


This command is in an extended test phase. Please report bugs and feature requests to the DKRZ support.

retrieves the provided resources if they are cached; does not automatically start a recall from tape

$ slk_helpers retrieve ((-d|--destination) <RETRIEVAL_DESTINATION>) [<RESOURCE_PATH_1> [<RESOURCE_PATH_2> [...]]|<RESOURCE_ID_1> [<RESOURCE_ID_2> [...]]|<SEARCH_ID>] [--dry-run] [-ns] [-R] [--resource-ids|--search-id] [--suppress-input-info] [-v|-vv]

The command attempts to read from stdin if neither <RESOURCE_PATH_1> ... nor <RESOURCE_ID_1> ... nor <SEARCH_ID> are provided. This means that you can also pipe data into this command.

  • --dry-run: list which files would be retrieved from the cache but do not actually start the retrieval

  • --ignore-existing: skip if file exists (even when it is a different file)

  • --json-batch: prints a JSON summary to the terminal muting all other output to stdout

  • --json-to-file: prints a JSON summary to a file (if you do not want to capture it together with verbose information)

  • --print-progress: prints the progress of this command to stderr (if you do not want to capture it together with JSON or verbose information)

  • --resource-ids: interprets the input as a list of resource ids

  • --run-as-slurm-job-with-account <ACCOUNT>: generate a SLURM job script, which will retrieve requested files (no automatic recall!) and submit it directly (only create it if --dry-run is set)

  • --search-id: interprets the input as a search id

  • --stop-on-failed-retrieval: stop immediately when one file cannot be retrieved

  • --suppress-input-info: suppress user information when input is read from stdin

  • --write-envisaged-to-file: write files, which could be retrieved, to provided file (failed files, which should be retrieved, are ignored)

  • --write-missing-to-file: write files, which currently cannot / could not be retrieved, to provided file

  • -d <RETRIEVAL_DESTINATION>, --destination <...>: destination path; only files which do not exist in <destination> will be considered; only considered when individual source files are provided; if source is a namespace, this option is ignored; required

  • -f, --force-overwrite: force overwrite of all existing files

  • -ns: preserve namespace in destination (takes effect only in combination with -d / --destination)

  • -R: retrieve recursively

  • -v: single verbose mode

  • -vv: double verbose mode

Starts retrieval of the provided resources to the provided destination for all resources which are in the cache. Does not start recalls for resources which are not in the cache. If you provide resource ids or a search id, they do not need to be provided directly after --resource-ids or --search-id. Instead, these two parameters are merely switches for the interpretation of the input. They also apply when resources are provided via stdin.

Retrieves only files which are not already in <RETRIEVAL_DESTINATION>. By default, retrieve expects the files to be located directly in <RETRIEVAL_DESTINATION>. If -ns is set, it reconstructs the full paths of the resources below <RETRIEVAL_DESTINATION>.
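As an illustration of --resource-ids acting as a switch for input read from stdin, the following sketch feeds a file of resource ids into slk_helpers retrieve. It is not an official workflow: the function name retrieve_cached_ids is hypothetical, the one-id-per-line file format is an assumption, and only options listed above are used. Since the numeric exit codes are not relied upon here, the function only distinguishes zero from non-zero.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of slk_helpers).
# $1: text file with resource ids (assumption: one id per line)
# $2: retrieval destination
# $3: file to which non-retrievable (e.g. not cached) files are written
retrieve_cached_ids() {
    local id_file="$1" destination="$2" missing_file="$3" rc
    # --resource-ids only switches how the input is interpreted;
    # the ids themselves arrive via stdin
    slk_helpers retrieve -d "$destination" --resource-ids \
        --write-missing-to-file "$missing_file" < "$id_file" && rc=0 || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "retrieve exited with code $rc; see $missing_file for files that need a recall first" >&2
    fi
    return "$rc"
}

# Usage (requires a valid slk session; paths are placeholders):
#   retrieve_cached_ids ids.txt /work/ab1234/data missing.txt
```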

Usage of --run-as-slurm-job-with-account is described here.

slk_helpers search_immediately#

start search and return search id immediately

$ slk_helpers search_immediately "RQL Search Query"
$ slk_helpers search_immediately 'RQL Search Query'

This command starts a background search, exits immediately after the search has been submitted, and returns the corresponding search id. The search query has to be specified in a query language that was designed by StrongLink. The query language is described on the StrongLink query language page and in the StrongLink Command Line Interface Guide from page 6 onwards.


Operators in queries start with a $. If a query is delimited by " then the $ has to be escaped by a leading \ (\$OPERATOR). Otherwise, the operator is interpreted as environment variable by the shell. Alternatively, use ' as delimiter.
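The effect of the two delimiters can be checked in the shell without contacting StrongLink. The sketch below only builds strings and prints them; the query itself and the path /arch/ab1234 are illustrative placeholders, and the variable gte is set deliberately to make the unwanted expansion visible.

```shell
#!/usr/bin/env bash
# Illustrative query only; /arch/ab1234 is a placeholder path.

# single quotes: the shell leaves $gte untouched
query_single='{"path": {"$gte": "/arch/ab1234"}}'

# double quotes: every inner " and the $ must be escaped
query_double="{\"path\": {\"\$gte\": \"/arch/ab1234\"}}"

# double quotes without escaping the $: the shell substitutes the variable gte
gte="OOPS"
query_broken="{\"path\": {\"$gte\": \"/arch/ab1234\"}}"

printf '%s\n' "$query_single"   # {"path": {"$gte": "/arch/ab1234"}}
printf '%s\n' "$query_double"   # {"path": {"$gte": "/arch/ab1234"}}
printf '%s\n' "$query_broken"   # {"path": {"OOPS": "/arch/ab1234"}}
```

If gte were unset, the broken variant would silently contain an empty operator name, which is why single quotes are usually the less error-prone choice.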

slk_helpers search_incomplete#

check if submitted search has not completed yet


This command is work in progress and might be changed in future.

$ slk_helpers search_incomplete <SEARCH_ID>

Prints out whether the search, to which the SEARCH_ID points, is incomplete (== still running) or not (== finished). A complete search might be successful or failed (see slk_helpers search_successful).

slk_helpers search_limited#


This command will be deprecated soon. Please use slk search or, in special situations, slk_helpers search_immediately instead.

submit a search query

$ slk_helpers search_limited "RQL Search Query"
$ slk_helpers search_limited 'RQL Search Query'

This command conducts a background search for files that match the provided query. The query has to be specified in a query language that was designed by StrongLink. The query language is described on the StrongLink query language page and in the StrongLink Command Line Interface Guide from page 6 onwards.


Operators in queries start with a $. If a query is delimited by " then the $ has to be escaped by a leading \ (\$OPERATOR). Otherwise, the operator is interpreted as environment variable by the shell. Alternatively, use ' as delimiter.

A search result ID (search_id) is returned if 1000 or fewer results were found. If more results were found, an error is printed and no search ID is returned. The limit of 1000 refers to the total number of results, of which some might not be visible to the user. The search ID can be used to list and retrieve files from the archive (see below).

One might need the user or group ids of respective users/groups to search files belonging to them. These ids are obtained as follows.

Get user id:

# get your user id
$ id -u

# get the id of any user
$ id -u USER_NAME

# get the passwd entry of any user (the user id is the third field)
$ getent passwd USER_NAME
#  OR
$ getent passwd USER_NAME | awk -F: '{ print $3 }'

# get user name from user id
$ getent passwd USER_ID | awk -F: '{ print $1 }'

Get group id:

# get group ID from group name
$ getent group GROUP_NAME
#  OR
$ getent group GROUP_NAME | awk -F: '{ print $3 }'

# get group name from group id
$ getent group GROUP_ID | awk -F: '{ print $1 }'

# get names and ids of all groups of which you are a member
$ id

Please see our documentation of specific Usage Examples (slk Usage Examples) and the StrongLink Command Line Interface Guide for exemplary calls of slk search. These can be used 1:1 with slk_helpers search_limited.


slk_helpers search_limited counts all files and namespaces that match the search query, including those which the current user is not allowed to see/read. slk list lists only the resources which the user is allowed to see. Therefore, slk_helpers search_limited might print an error that more than 1000 resources were found although there are fewer than 1000 matches for which the user has read permissions. Moreover, different users might get different output of slk list for the same search id.

slk_helpers search_status#

return status of search

$ slk_helpers search_status <SEARCH_ID>

Prints the status of a search.

slk_helpers search_successful#

check if search was successful


This command is work in progress and might be changed in future.

$ slk_helpers search_successful <SEARCH_ID>

Prints out whether the search, to which the SEARCH_ID points, was successful or not. A search that was not successful might have failed or might be incomplete (see slk_helpers search_incomplete).

slk_helpers searchid_exists#

check if search id exists

$ slk_helpers searchid_exists <SEARCH_ID>

Prints out whether the provided search ID exists or not.

slk_helpers session#

check if current slk session is valid

$ slk_helpers session

Prints until when the current slk session is valid.

slk_helpers size#

print size of resource; targets one resource only

$ slk_helpers size (RESOURCE_PATH|--resource-id RESOURCE_ID) [-R|--recursive] [--pad-spaces-left WIDTH] [-v|-vv]
  • --pad-spaces-left <width>: pad spaces on the left of the printed size so that total width (spaces + number) is width; default: -1 (no padding)

  • -R / --recursive: calculate folder size by summing sizes of contained files recursively

  • --resource-id: get size of a file with given resource id instead of path; default: -1

  • -v single verbose mode: print sizes of all namespaces recursively

  • -vv double verbose mode: print sizes of all resources recursively

Returns the file size in bytes. If a namespace / directory is targeted and -R / --recursive is not set, 0 is returned. If a namespace / directory is targeted and -R / --recursive is set, the size is calculated recursively. If the resource does not exist, an error is printed and exit code 1 is returned. All other errors cause an exit code of 2.
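In a script, the exit codes described above can be used to tell a missing resource apart from other failures. The following is a minimal sketch; the function name print_size_or_reason and the wording of the messages are hypothetical, and it assumes that slk_helpers size prints only the size on stdout when it succeeds.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of slk_helpers).
# $1: resource path
print_size_or_reason() {
    local resource="$1" size rc
    size="$(slk_helpers size "$resource")" && rc=0 || rc=$?
    case "$rc" in
        0) echo "$resource: $size bytes" ;;
        1) echo "$resource: does not exist" >&2 ;;
        *) echo "$resource: error (exit code $rc)" >&2 ;;
    esac
    return "$rc"
}

# Usage (requires a valid slk session; the path is a placeholder):
#   print_size_or_reason /arch/ab1234/a.nc
```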

slk_helpers submit_verify_job#

submits a verify job and returns the job id

$ slk_helpers submit_verify_job [-v] RESOURCE_PATH [RESOURCE_PATH [...]] [(-R|--recursive)] [--save-mode]
$ slk_helpers submit_verify_job [-v] --resource-ids RESOURCE_ID [RESOURCE_ID [...]] [(-R|--recursive)] [--save-mode]
$ slk_helpers submit_verify_job [-v] --search-id SEARCH_ID [--save-mode] [--resume-on-page <n>]
$ slk_helpers submit_verify_job [-v] --search-query 'SEARCH_QUERY' [--save-mode]
# currently, only for admin users:
$ slk_helpers submit_verify_job [-v] (-i|--infile|--input) JSON_VERIFY_JOB_FILE
  • -i / --infile / --input  JSON_VERIFY_JOB_FILE: a verify job can be described by a JSON expression; this JSON can be provided as file to this command

  • -j: print output as JSON

  • -R / --recursive: if a resource path or resource id points to a namespace, consider all resources in this namespace recursively

  • --resource-ids RESOURCE_ID [RESOURCE_ID [...]]: target resources by their resource ids instead of their resource paths

  • --resume-on-page <n>: resume the command and start submitting jobs starting with search result 1000 * n; internally, 1000 search results are on one ‘page’ and fetched by one request => therefore 1000 * n; you do not necessarily have read permissions for 1000 files per page

  • --save-mode: save mode suggested to be used in times of many timeouts; please do not regularly use this parameter; start one verify job per page of search results instead of one verify job for 50 pages of search results

  • --search-id SEARCH_ID: target resources which were found by this search

  • --search-query 'SEARCH_QUERY': target resources which will be found by a search defined by this search query

  • -v: verbose mode; print information on what is currently done; recommended

Starts a verify job for the selected files. Files for which the current user does not have read permissions are automatically ignored. No error message or warning is printed if files are ignored. The result of the verify job can currently be fetched as verify report via slk_helpers result_verify_job. We strongly recommend reading Reference: StrongLink verify reports before evaluating a verify report for the first time. The checked files are listed in the header of the verify report.

One verify job is limited to 50000 resources because the run time of the job increases considerably for higher numbers of resources. If the verification of more than 50000 files is requested, multiple verify jobs are submitted. All job ids are printed, one job id per line. A verify job targeting 50000 files runs approximately 6 minutes.
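The numbers above allow a rough estimate of how many verify jobs a request will produce. The sketch below is an assumption-laden illustration, not the logic used by submit_verify_job: it assumes the files are split into chunks of 50000 (rounding up) and that every job takes about as long as a full one, so the minutes are an upper bound on pure job run time and ignore queue waiting time. The function name verify_job_estimate is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical estimate (not part of slk_helpers).
# $1: number of files to verify
# prints: "<number of jobs> <summed run time in minutes>"
verify_job_estimate() {
    local n_files="$1"
    local per_job=50000        # limit per verify job (from the text above)
    local minutes_per_job=6    # approximate run time of a full job (from the text above)
    # integer ceiling division
    local n_jobs=$(( (n_files + per_job - 1) / per_job ))
    echo "$n_jobs $(( n_jobs * minutes_per_job ))"
}

verify_job_estimate 120000   # prints "3 18"
```

The actual number of jobs can be lower because files without read permission are ignored.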

Verify jobs are submitted to the same StrongLink-internal queue as retrieval/recall jobs. Thus, if 100 retrieval/recall jobs are waiting in the queue, new verify jobs line up at the end and have to wait a long time. Non-admin users cannot submit new verify jobs if two or more of their jobs are already running. If a single submit_verify_job call needs to submit multiple verify jobs whose number exceeds the limit of two jobs per user, the command is allowed to do so if at least one job slot is free. Thus, more than two verify jobs might be running in certain situations.


The option -i / --infile / --input is currently deactivated for normal users because it allows setting a few verify job options which might harm the speed or stability of the StrongLink system. If it becomes possible to restrict the usage of these options in the future, we might release this parameter for general usage.

slk_helpers tape_barcode#

return barcode of tape with provided id

$ slk_helpers tape_barcode TAPE_ID

Returns the barcode of a tape with tape id TAPE_ID if it exists.

slk_helpers tape_exists#

check if tape exists

$ slk_helpers tape_exists (TAPE_ID|--tape-barcode TAPE_BARCODE)

Returns whether the tape with tape id TAPE_ID or tape barcode TAPE_BARCODE exists in the tape library or not.

slk_helpers tape_id#

return id of tape with provided barcode

$ slk_helpers tape_id TAPE_BARCODE

Returns the ID of a tape with tape barcode TAPE_BARCODE if it exists.

slk_helpers tape_library#

print name of the tape library in which a tape is located

$ slk_helpers tape_library (TAPE_ID|--tape-barcode TAPE_BARCODE)

Returns the name of the tape library in which the tape is stored.

slk_helpers tape_status#

print status of a tape

$ slk_helpers tape_status [--details] (TAPE_ID|--tape-barcode TAPE_BARCODE)
  • --details: print a more detailed description of the retrieval status (different states of AVAILABLE are possible)

Prints the status of a tape with tape id TAPE_ID or tape barcode TAPE_BARCODE for retrievals: AVAILABLE, BLOCKED or ERRORSTATE. The meaning of each status is given in Tape Stati below. Please contact the DKRZ support if you encounter a tape with ERRORSTATE.

slk_helpers tnsr#

please see total_number_search_results

slk_helpers total_number_search_results#

print number of results found by provided search

$ slk_helpers total_number_search_results [-q|--quiet] <SEARCH_ID>
  • -q / --quiet: print no warnings

Prints out the total number of search results. All search results independent of user permissions are counted. E.g. if a search found 10 results but the current user can only see 1 result, then slk list will print out this 1 result whereas slk_helpers tnsr will print 10.


Tape Stati#

  • AVAILABLE: tape is fully available

  • BLOCKED: currently data is written onto the tape; recalls/retrievals targeting this tape will fail until the write process is finished; please wait a few hours

  • ERRORSTATE: tape is in an error state which needs to be reset; currently, no recall/retrieval from this tape is possible; please contact the DKRZ support

  • UNAVAILABLE: only used in group_files_by_tape for files without storage information; no recall/retrieval possible

  • UNCLEAR: only used in group_files_by_tape for files stored on multiple tapes each; status of these tapes was not checked
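A script that reacts to these states could map each one to a next step. The following sketch is only an illustration: the function name tape_action and the action labels are hypothetical, and the usage line assumes that slk_helpers tape_status prints nothing but the status word.

```shell
#!/usr/bin/env bash
# Hypothetical mapping (not part of slk_helpers).
# $1: tape status as listed above
tape_action() {
    case "$1" in
        AVAILABLE)   echo "recall" ;;            # tape is fully available
        BLOCKED)     echo "wait" ;;              # data is being written; retry in a few hours
        ERRORSTATE)  echo "contact-support" ;;   # tape needs to be reset
        UNAVAILABLE) echo "skip" ;;              # no storage information; no recall possible
        UNCLEAR)     echo "check-tapes" ;;       # multiple tapes; status was not checked
        *)           echo "unknown" ;;
    esac
}

# Usage (assumption: tape_status prints only the status word):
#   tape_action "$(slk_helpers tape_status <TAPE_ID>)"
```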

Job Stati#

  • BLOCKED: job is blocked by another running job (please retry later; e.g. 60 min)

  • QUEUED: job is queued in StrongLink

  • PROCESSING: job is being processed (= files are read from tape)

  • PAUSED: job has been paused by a StrongLink admin; there is an issue with your job; please contact the DKRZ support (data protection: StrongLink admins cannot view the job owner)

  • COMPLETED: job has been completed; was replaced by SUCCESSFUL and FAILED; might be returned in rare situations

  • SUCCESSFUL: job has been completed and was successful

  • FAILED: job has been completed and was not successful

  • ABORTED: job has been aborted by a StrongLink admin; there has been an issue with your job; please contact the DKRZ support (data protection: StrongLink admins cannot view the job owner)

  • STOPPED: job has been stopped (very rare)

  • WAITING: job is waiting for something (very rare)

  • OTHER: other not clearly defined state (very rare)
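When waiting for a job in a script, it helps to decide for each state whether polling should continue. The grouping below is one possible reading of the list above, not an official classification: the function name job_state_kind and its labels are hypothetical, and STOPPED, WAITING and OTHER are deliberately left as "other" because their meaning is not clearly defined.

```shell
#!/usr/bin/env bash
# Hypothetical classification (not part of slk_helpers).
# $1: job status as listed above
job_state_kind() {
    case "$1" in
        SUCCESSFUL)                echo "finished-ok" ;;
        FAILED)                    echo "finished-failed" ;;
        COMPLETED)                 echo "finished-unknown" ;;  # legacy state; success not implied
        PAUSED|ABORTED)            echo "contact-support" ;;
        BLOCKED|QUEUED|PROCESSING) echo "pending" ;;           # keep polling
        *)                         echo "other" ;;             # STOPPED, WAITING, OTHER, ...
    esac
}
```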

Exit codes#



exit code

bad input command

always (redirected to slk help)



(not help, version and session)

(not help, version and session)

session expired


issue related to config file


connection timeout or connection could not be established






resource exists and has checksum


resource not found


requested checksum not available


resource path and resource ID provided


any other error except connection issue



resource exists


resource does not exist


any other error except connection issue



any error except connection issue



query successfully generated


any other error except connection issue



query successfully generated


a field name or a schema name does not exist or a value cannot be converted


any other error except connection issue


gfbt (same as group_files_by_tape)

files successfully grouped


any error except connection issue



files successfully grouped


any error except connection issue



hostname is set and is as printed


any error except connection issue



metadata exported successfully


any error except connection issue



resource exists and is cached


resources exist and all of them are cached


resource exists and is not cached


resources exist and at least one is not cached


resource(s) do(es) not exist


resource path and resource ID provided


resource path and search ID provided


search ID and resource ID provided


any other error except connection issue



login token exists and belongs to an admin user


login token exists but belongs to a normal user


no login token


session expired


any error except connection issue



resource exists and is on tape


resources exist and all of them are on tape


resource exists and is not on tape


resources exist and at least one is not on tape


resource(s) do(es) not exist


resource path and resource ID provided


resource path and search ID provided


search ID and resource ID provided


any other error except connection issue



metadata imported successfully


any error except connection issue



job exists


job does not exist


any error except connection issue



number of jobs printed successfully


any error except connection issue



status of the job printed successfully


job has failed or was aborted


any error except connection issue



no file is flagged as “partial file”


at least one file is flagged as “partial file”


any error except connection issue



resource exists


resource does not exist


any error except connection issue



search id correct and search results to print


search id correct but no results to print


search id does not exist


any error except connection issue



search id correct and search results to print


search id correct but no results to print


search id does not exist


any error except connection issue



resource exists and metadata available


resource does not exist


any error except connection issue



namespace successfully created


namespace with same name already exists


any other error except connection issue



all targeted files could be touched


any other error except connection issue



sizes and checksums of all file parts printed


file has 0 byte and no storage info


one or more checksums not available


file > 0 byte but has no storage info


resource exists but is a namespace (folder)


resource does not exist


invalid combination of input parameters


any other error except connection issue



recall job started successfully


at least one targeted tape is unavailable


no resources to recall (e.g. all are cached)


any error except connection issue



recall job started successfully


at least one targeted tape is unavailable


no resources to recall (e.g. all are cached)


any error except connection issue



resource with given ID exists


resource with given ID does not exist


any error except connection issue



resource(s) with given path exist(s)


one or more resources do not exist


any error except connection issue



resource with given ID exists


resource with given ID does not exist


any error except connection issue



resource with given ID exists


resource with given ID or path does not exist


resource path and resource ID provided


any error except connection issue



resource with given ID exists and is a file


resource with given ID or path does not exist


resource is a namespace


resource path AND resource ID provided


any error except connection issue



resource with given ID exists


resource with given ID or path does not exist


resource path AND resource ID provided


any error except connection issue



job report successfully fetched and printed


job id does not exist, job not finished or job not-successfully finished


any error except connection issue



resources successfully retrieved


one or more resources could not be retrieved because they were not cached


any error except connection issue (e.g. output file could not be written; unexpected retrieval error arose)



search successfully submitted


any error except connection issue



search is incomplete (== search still running)


search is complete (== search has finished)


search id does not exist


any error except connection issue



search successfully performed


any error except connection issue



search was successful


search failed or is still running (incomplete)


search id does not exist


any error except connection issue



search was successful


search failed or is still running (incomplete)


search id does not exist


any error except connection issue



search id exists


search id does not exist


any error except connection issue



login token exists and is not expired


no login token


session expired


any error except connection issue



resource exists (file or namespace)


resource with given ID or path does not exist


resource path and resource ID provided


any other error except connection issue



verify job successfully submitted


user reached allowed job limit of two jobs


wrong combination of input parameters


any other error except connection issue



tape exists


tape does not exist


any error except connection issue



tape exists


tape does not exist


tape barcode AND tape ID provided


any error except connection issue



tape exists


tape does not exist


any error except connection issue



tape exists


tape does not exist


tape barcode AND tape ID provided


any error except connection issue



tape is available for reading


tape is blocked / currently no reading


tape barcode AND tape ID provided


any error except connection issue


tnsr (same as total_number_search_results)

search was successful


search failed or is still running (incomplete)


search id does not exist


any error except connection issue



search was successful


search failed or is still running (incomplete)


search id does not exist


any error except connection issue



targeted resource (cached file) was touched


targeted resource is not cached


targeted resource is no file


any error except connection issue





Technical background of selected commands#

slk_helpers gen_file_query#

The search query is generated as follows: The input file list is taken and each path is separated into filename (like basename PATH) and directory (like dirname PATH). All filenames in the same directory are grouped and a regular expression is generated which finds exactly these files. Then this expression is linked via an and to the respective directory in which these files are located. This is done for each distinct directory in the input. These search expressions are linked via an or at the top level.
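The grouping step described above can be sketched with standard tools. This is not the slk_helpers implementation and it does not generate a StrongLink query: it only splits each path into directory and filename and collects the filenames per directory. The function name group_by_directory is hypothetical, absolute paths are assumed, and joining the names with | is purely for display (a real regular expression would additionally need escaping of characters such as the dot).

```shell
#!/usr/bin/env bash
# Hypothetical illustration of the grouping step (not part of slk_helpers).
# stdin:  one absolute resource path per line
# stdout: "<directory><TAB><file1>|<file2>|..." per distinct directory,
#         in order of first appearance
group_by_directory() {
    awk '
    $0 == "" { next }                      # ignore empty lines
    {
        n = split($0, parts, "/")
        file = parts[n]                    # like: basename PATH
        dir = substr($0, 1, length($0) - length(file) - 1)   # like: dirname PATH
        if (dir == "") dir = "/"
        if (dir in files) {
            files[dir] = files[dir] "|" file
        } else {
            order[++count] = dir
            files[dir] = file
        }
    }
    END {
        for (i = 1; i <= count; i++) printf "%s\t%s\n", order[i], files[order[i]]
    }'
}

# Usage (paths are placeholders):
#   printf '%s\n' /arch/x/a.nc /arch/y/c.nc /arch/x/b.nc | group_by_directory
```

Each output line corresponds to one directory-plus-filenames condition; in the generated query these conditions are then combined with an or at the top level.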

It is checked whether a directory exists in StrongLink. An error is thrown if it doesn’t exist.

The resulting search query can be optimized in length by the user. We do not do this in the slk_helpers because it would add considerable complexity to the code.

Major Changes#

1.13.3 (2025-02-20)#

  • fixed bug in slk_helpers gfbt --gen-search-query which printed debug output

1.13.2 (2025-01-31)#

  • new argument for slk_helpers retrieve: --run-as-slurm-job-with-account <ACCOUNT> which will generate a SLURM job script for the retrieval

  • removed argument -d from gfbt / group_files_by_tape because a user might expect it to be the short version of --destination although it was the short version of --details

  • fixed issue related to verify and recall jobs: in some situations, wrong job names were generated or compared

  • fixed a bug which caused slk_helpers retrieve to be unable to retrieve a regular 0-byte file from the cache

  • fixed a bug where a regular 0-byte file was not recognized as being available for retrieval

1.13.1 (2024-12-10)#

  • commands recall and retrieve automatically remove duplicates in the input resource list; the order of the resources is not preserved

1.13.0 (2024-12-06)#

  • important changes:
    • gfbt / group_files_by_tape only evaluates regular expressions in the input if --regex or --evaluate-regex-in-input is set

    • gfbt / group_files_by_tape got many new features

    • new commands slk_helpers recall and retrieve are extended versions of the slk recall and slk retrieve commands; they are much easier to use in automated workflows; slk_helpers retrieve starts no automatic recall but only retrieves files from the cache

    • new command resource_id does the same as exists but accepts multiple resource paths as input

    • list_clone_file accepts multiple resource paths as input

    • result_verify_job goes through all files checked by the verify job and performs additional file checks which the verify job does not perform

  • general bug fixes:
    • tapes which are not available anymore are ignored by most commands

    • removed debugging comments from previous versions that were forgotten

  • new commands:
    • recall

    • recall_needed: checks whether a recall needs to be performed or not (does not check whether a recall is possible or not)

    • resource_id: like exists but accepts multiple resource paths as input and has different output format

    • retrieve

  • command checksum: fixed exit code

  • commands job_report and result_verify_job
    • in the past, they ignored non-existing files; now, they print them

    • StrongLink might shorten the path of files like /arch/blub/ to /arch/blub~/; for each non-existing file in the output of these commands, we now check this case

    • result_verify_job identifies additional problematic files which are not recognized by verify jobs

  • command group_files_by_tape / gfbt, new arguments:
    • --resource-ids: expect resource ids as input

    • -dst <dst> / --destinationPath <dst>: ignore files which exist already in dst

    • -ns: preserve original namespace in destinationPath

    • -wf1 <dst> / --retrieval-workflow-1 <dst>: “workflow 1” => shortcut for --details --count-tapes -ns --write-resource-id --destinationPath <dst>

    • -wrid / --write-resource-id: write resource ids per tape to text files with names files_tape_<tape barcode>.txt; further created files are files_all.txt (all resource ids), tapes.txt (all tapes for which the first type of file are created) and (parameters for watcher scripts); possibly, the files files_multipleTape.txt, files_notStored, files_ignored and/or files_cached are created

    • -ao / --append-output: when -wrid is set and target files already exist, append output to them (error otherwise)

    • -oo / --overwrite-output: when -wrid is set and target files already exist, overwrite them (error otherwise)

    • regular expressions in the input are only evaluated when --regex / --evaluate-regex-in-input is set

  • command list_clone_search: can print resource ids instead of resource paths (--print-resource-ids)

  • command list_clone_file
    • can print a fifth timestamp when --print-more-timestamps is set

    • accepts multiple resource paths as input

    • can print resource ids instead of resource paths (--print-resource-ids) in the right most column

    • can read resource paths from a file --read-from-file <file>

    • can read resource paths from stdin (on empty input)

  • command recall:
    • starts recall job for provided resources and instantly returns StrongLink recall job id

    • accepts a list of resource paths or resource ids or one search id

    • resources or search id can be piped into the command (e.g.: cat file_list.txt | slk_helpers recall ...)

    • if -d/--destination <dst> is set, only files not present in dst are recalled (files compared based on size and mtime)

  • command resource_id:
    • works like exists but

    • accepts multiple resource paths as input (provided in the command call, via stdin or via a file --read-from-file <file>)

    • prints <resource path>: <resource id> or <resource path>: not exists or <resource path>: problem accessing resource (+ throws error)

  • command resource_tape got parameter --print-tape-barcode-only

  • command retrieve:
    • starts retrieval of provided resources to the dst provided by -d/--destination <dst>

    • accepts a list of resource paths or resource ids or one search id

    • resources or search id can be piped into the command (e.g.: cat file_list.txt | slk_helpers retrieve ...)

    • if -vv is set, detailed output per file is printed

    • files not stored in the cache are not retrieved and no recall is started for them

    • a file listing all resources which could not be retrieved can be written via --write-missing-to-file <output_file>

  • command size exits with an error if a file has an internal size mismatch which is not visible to the user (affected 15 files of 2 x 10^7 files; can only occur when the same file is archived multiple times in parallel to the same location)

1.12.10 (2024-04-12)#

  • total_number_search_results / tnsr returns exit code 1 when the search is not finished or when the search failed

  • tape_id returned the tape library name instead of the tape id (fixed; bug present since version 1.12.7)

1.12.9 (2024-04-09)#

  • fixed a bug which caused tape_status and group_files_by_tape / gfbt to exit with an error when a tape was blocked by a system-job

1.12.8 (2024-04-02)#

  • added a column containing the tape id to the output of list_clone_file and list_clone_search (column 9; path is now in column 10)

1.12.7 (2024-03-28)#

  • new hidden command tape_library (slk_helpers tape_library (<tape_id>|--tape-barcode <tape_barcode>))

1.12.6 (2024-03-22)#

  • improved information in error messages (related to resources and jobs)

  • updated estimation of verify jobs to submit in submit_verify_job

1.12.5 (2024-03-17)#

  • fixed command tnsr

1.12.4 (2024-03-17)#

  • removed false warnings from size command

  • new short version tnsr for the command total_number_search_results

1.12.3 (2024-03-14)#

  • new hidden command: list_clone_file

  • hidden command list_clone renamed to list_clone_search

  • fixed parameter --count in list_clone / list_clone_search

  • adapted value of source in json output of submit_verify_job

1.12.2 (2024-03-11)#

  • submit_verify_job: new parameter --results-per-page is now hidden

1.12.1 (2024-03-11)#

  • reordered code for writing user information on executed command to the log

  • job_report: fixed a few errors

  • print_rcrs: fixed output of resource id in error message

  • result_verify_job:
    • improved error output

    • throws a warning instead of an error when the numbers of submitted and checked files do not agree

    • no longer throws an error when the target namespace does not exist

  • search_status also returns exit code 1:
    • exit code 0: search ended successfully

    • exit code 1: running or failed (incl. search timeout)

    • exit code 2: any error returned by StrongLink which is not captured by 1 or 3

    • exit code 3: timeout while communicating with StrongLink

  • size: fixed warnings

  • submit_verify_job
    • has new parameters --end-on-page and --results-per-page

    • updated restart information when the command times out

    • modifications and improvements in internal structure
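
The exit codes of search_status listed above can be mapped to readable messages in a wrapper script. The following Python sketch only encodes the table from this changelog entry; the names SEARCH_STATUS_EXIT_CODES, describe_search_status and search_succeeded are hypothetical, and how the exit code is obtained (for example from the return code of a subprocess call) is left to the caller.

```python
# Meaning of the search_status exit codes as listed in this changelog entry.
SEARCH_STATUS_EXIT_CODES = {
    0: "search ended successfully",
    1: "search is running or failed (incl. search timeout)",
    2: "error returned by StrongLink which is not captured by 1 or 3",
    3: "timeout while communicating with StrongLink",
}


def describe_search_status(exit_code):
    """Return a short description for an exit code of search_status."""
    return SEARCH_STATUS_EXIT_CODES.get(exit_code, f"unknown exit code {exit_code}")


def search_succeeded(exit_code):
    """Only exit code 0 means that the search ended successfully."""
    return exit_code == 0
```

Note that exit code 1 does not distinguish between a running and a failed search, so a script cannot decide from this code alone whether waiting longer is useful.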

1.12.0 (2024-02-22)#

  • new parameter --json for commands print_rcrs, submit_verify_job and result_verify_job

  • fixed output of resource_tape when target file was not stored on a tape yet

  • list_clone (now list_clone_search): width of uid and gid columns increased

  • changes in submit_verify_job:
    • parameter --resume-on-page is not hidden anymore

    • fixed output of parameter --resume-on-page

    • improved error messages

  • changes in result_verify_job:
    • improved error messages

    • remove resources which do not exist anymore from report

    • captures a StrongLink bug when a file had the same name as its parent namespace

    • new parameter --json

  • minor fix in generation of search query via gen_search_query

  • new hidden command search_status

  • basic logging into ~/.slk/slk-cli.log

  • commands can be disabled in /etc/stronglink.conf by adding a list with disabled commands via key disabled_commands: "disabled_commands":["cmd1", "cmd2", ...]
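
To illustrate the disabled_commands key mentioned above, the following Python sketch checks whether a command appears in such a list. It assumes that the configuration text is a JSON object, which the notation "disabled_commands":["cmd1", "cmd2", ...] suggests but which is not confirmed here; the function names are hypothetical and cmd1 / cmd2 are the placeholder names from the entry above.

```python
import json


def disabled_commands(config_text):
    """Return the set of disabled commands found in a JSON config text."""
    config = json.loads(config_text)
    # A missing key is interpreted as "no command is disabled" (assumption).
    return set(config.get("disabled_commands", []))


def is_disabled(command, config_text):
    """Check whether a single command is listed as disabled."""
    return command in disabled_commands(config_text)


# Made-up configuration content for illustration only.
EXAMPLE_CONFIG = '{"disabled_commands": ["cmd1", "cmd2"]}'
```

In practice the text would be read from /etc/stronglink.conf, which may contain further keys that this sketch ignores.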

1.11.2 (2024-01-15)#

  • new hidden command list_clone (later renamed to list_clone_search):
    • prints search results

    • similar to list_search

    • prints four time stamps (in this order):
      • mtime of the file (time stamp of “last modification of the file prior to the archival”)

      • time stamp of first file version in StrongLink

      • time stamp of current file version in StrongLink

      • time stamp of last StrongLink-internal copy process of this file (e.g. last recall)

  • various small code updates: better error handling; capture problematic situations; fixed typos

  • improved commands which are based on searches (group_files_by_tape, total_number_search_results, submit_verify_job, …):
    • search timeouts are properly captured

    • unknown/unexpected search statuses are recognized

    • better error handling when search does not finish or is not successful

  • print_rcrs updates:
    • prints tape id and barcode

    • got the parameter --json to print output as JSON

1.11.1 (2023-12-20)#

  • new command resource_tape which prints the tape(s) on which a resource is stored; can be used if gfbt / group_files_by_tape does not work due to high system load of StrongLink

  • fixed a bug in gfbt / group_files_by_tape which caused a wrong tape id / barcode to be used when a file was migrated from an old HPSS tape to a new tape

1.11.0 (2023-12-08)#

  • If files are in an unclear caching state, commands like iscached and is_on_tape will not exit with an error but throw a warning.

  • If files are in an unclear caching state, iscached will inform the user that files in unclear caching state exist and exit with an error. If -v or -vv is set, the files in unclear caching state will be listed.


An unclear caching state should only occur while a file is copied from tape to the HSM cache or when there is a connection issue between StrongLink and the HSM cache. Please contact when this happens.

1.10.2 (2023-11-29)#

  • updated verbose messages for size and result_verify_job commands

  • updated help text of size

  • removed debugging output from submit_verify_job

1.10.1 (2023-11-20)#

  • updated verbose messages for size command

1.10.0 (2023-11-17)#

  • better handling of connection timeouts with StrongLink

  • updated submit_verify_job:
    • updated output when the command does not submit any verify job

    • added a parameter --resume-on-page <n> to simplify resuming the command in times of many connection losses to StrongLink

    • added a parameter --save-mode to start verify jobs for only 1000 files or fewer in order to simplify restarting the command in times of many connection losses

  • new command result_verify_job:
    • list relevant errors of verify job (default; no special arguments)

    • list checked files (--sources)

    • get part of the header of the verify report (--header)

    • list number of errors and checked files (--number-errors and --number-sources)

  • extended command size by new parameters:
    • -R / --recursive for requesting the size of the content of folders recursively

    • --pad-spaces-left for space padding to the left in order to align file/namespace sizes when the command is called multiple times

1.9.10 (2023-10-23)#

  • add --quiet / -q to command total_number_search_results (hidden)

1.9.9 (2023-10-17)#

  • access constraints for requesting job information; non-admin users may only access:
    • VERIFY jobs of the current user or

    • COPY jobs, which do retrievals/recalls and were started by slk

  • commands submit_verify_job_files and submit_verify_job_namespace from previous release only allowed for admin users

  • new commands:
    • submit_verify_job: run a verify job for a provided set of files

    • is_admin_session: Checks whether the user is currently logged in as normal user or admin user

    • search_incomplete: Prints whether the search is incomplete (still running)

    • search_successful: Prints whether the search was successful

    • search_immediately: Creates search and returns search id immediately, even if search is not finished (hidden; only for specific use cases)

1.9.8 (2023-10-05)#

  • new commands (hidden because not final versions):
    • job_report: print a job report; e.g. of a verify job

    • submit_verify_job_files: submit a verify-job for a list of files (as paths) or of resource ids

    • submit_verify_job_namespace: submit a verify-job for a namespace (as path) or a resource/namespace id

    • print_rcrs: print size and checksums of file parts (some HPSS files are stored as two parts on two tapes)

  • catch all HTTP status codes >= 400 every time an HTTP request is sent


1.9.7 (2023-08-16)#

  • mkdir has a new argument -p / --parent which is similar to -R but throws no error when target exists and is a namespace/folder; thus, it behaves like the Linux mkdir -p in the terminal

  • changed error message which mkdir prints when it receives a path to a file as target
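
The behaviour described for mkdir -p / --parent (no error when the target already exists as a namespace/folder, but an error when the target is a file) mirrors what Linux mkdir -p does on a local file system. The following Python sketch demonstrates these semantics locally with os.makedirs; it is only an analogy and does not call slk_helpers or StrongLink. The function name is hypothetical.

```python
import os
import tempfile


def create_dir_with_parents(path):
    """Local analogue of 'mkdir -p': create missing parents, accept existing dirs."""
    # exist_ok=True suppresses the error for an existing directory only;
    # an existing regular file at 'path' still raises FileExistsError.
    os.makedirs(path, exist_ok=True)


def demo():
    """Run the analogy in a temporary directory and report what happened."""
    base = tempfile.mkdtemp()
    target = os.path.join(base, "a", "b", "c")
    create_dir_with_parents(target)  # creates a, a/b and a/b/c
    create_dir_with_parents(target)  # second call: no error
    file_path = os.path.join(base, "plain_file")
    open(file_path, "w").close()
    try:
        create_dir_with_parents(file_path)
        file_rejected = False
    except FileExistsError:
        file_rejected = True
    return os.path.isdir(target), file_rejected
```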

1.9.6 (2023-08-02)#

  • gfbt / group_files_by_tape has new parameter --print-resource-id which will print resource IDs instead of file paths

1.9.5 (2023-07-14)#

  • bug fixes regarding some old HPSS files used in gfbt

1.9.4 (2023-07-13)#

  • bug fixes in command gen_search_query:
    • modified description of command

    • added new field tape_barcode

    • field smart_pool internally was compared against tape_barcode

    • value of field smart_pool is now checked against list of existing Smart Pools

1.9.3 (2023-07-05)#

  • fixed conversion of dates to seconds since 1970 instead of milliseconds since 1970; relevant for gen_search_query

  • gen_search_query now also understands the operators <, >, <= and >=

  • removed unnecessarily created instances of ObjectMapper

1.9.2 (2023-06-30)#

  • new command gen_search_query:
    • generates a JSON search query which can be run by slk search

    • accepts search conditions/fields like netcdf.Project='abc', resources.birth_time='2023-01-01T13:00:00' or path=/arch/bm0146/k204221

    • search conditions/fields are linked via and

    • user can provide another existing search query via --search-query which is linked via and to the other input

1.9.1 (2023-06-16)#

  • search_limited: did not recognize certain failed searches in the past

  • iscached:
    • no error thrown anymore when resource ids are inserted which represent namespaces

    • correct output is printed when only one file was provided and -v or -vv is set

    • fixed issue when checking caching status of 0 byte files

  • job_status:
    • new job statuses FAILED and SUCCESSFUL replace old status COMPLETED; COMPLETED might still be used

    • returns exit code 1 if a job has status FAILED, ABORTED or ABORTING

  • new command is_on_tape:
    • same as iscached but checks if files are on tape

    • files, which are on tape and in the cache, are considered as being on tape

    • this is NOT the inverse of iscached => a file can be on tape and in the cache

  • optional verbose and summary output is now printed to stderr instead of stdout:
    • hsm2json: verbose output printed to stderr (but summary not)

    • gfbt / group_files_by_tape: print verbose output (diagnostic purpose) to stderr

    • search_limited: search status

  • these commands can handle searches of which one or more resources were deleted:
    • iscached

    • is_on_tape

    • has_no_flag_partial

    • hsm2json

    • gfbt / group_files_by_tape

  • list_search may list already deleted files (not checked for performance reasons)
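
The rule stated above for job_status (exit code 1 for the job states FAILED, ABORTED and ABORTING) can be expressed as a small lookup. The following Python sketch uses hypothetical names and normalizes whitespace and letter case, which is an assumption of this sketch and not documented behaviour of slk_helpers.

```python
# Job states for which job_status returns exit code 1 according to this entry.
FAILURE_STATES = frozenset({"FAILED", "ABORTED", "ABORTING"})


def job_state_signals_failure(state):
    """Return True if the given job state is one of the failure states."""
    # Normalizing case and whitespace is a convenience of this sketch.
    return state.strip().upper() in FAILURE_STATES
```

States such as SUCCESSFUL or the older COMPLETED are not in the set, so the function returns False for them.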

1.9.0 (2023-05-16)#

  • changed exit codes to 3 when a timeout error is thrown or a connection cannot be established

  • command has_flag_partial has been renamed to has_no_flag_partial

  • has_no_flag_partial behaves the same as iscached

  • iscached:
    • prints list of not-cached files when -v is set (now: negative list + summary; past: summary)

    • prints a summary in the end when -vv is set (now: full file list + summary; past: full file list)

  • job_queue:
    • new argument --format which has the same meaning as -i / --interpret

    • new output format JSON / J

1.8.10 (2023-05-08)#

  • new command has_flag_partial to check whether a file is flagged as partial (incomplete) file


has_flag_partial was renamed to has_no_flag_partial in slk_helpers 1.9.0.

1.8.9 (2023-05-02)#

  • fixed an error in tape_status which was thrown when no barcode was provided

  • command job_queue has new optional argument -i <INTERPRET_TYPE> / --interpret <INTERPRET_TYPE> with these values for INTERPRET_TYPE (case insensitive):
    • RAW/R: same as argument not set

    • TEXT/T: print short textual interpretation of the queue status => none, short, medium, long, jammed

    • DETAILS/D: print detailed textual interpretation of the queue status

    • NUMERIC/N: print a number representing the queue status => 0 (==none), 1 (==short), …, 4 (==jammed)
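
The TEXT and NUMERIC interpretations of job_queue describe the same five levels. The following Python sketch converts between them. The entry above only spells out 0 (none), 1 (short) and 4 (jammed); assigning 2 to medium and 3 to long follows the order of the TEXT values and is an assumption. The names QUEUE_LEVELS, queue_text_to_number and queue_number_to_text are hypothetical.

```python
# Queue levels in the order given for the TEXT interpretation.
QUEUE_LEVELS = ("none", "short", "medium", "long", "jammed")


def queue_text_to_number(text):
    """Map a textual queue status to its number (0 == none ... 4 == jammed)."""
    # Raises ValueError for a text which is not one of the five levels.
    return QUEUE_LEVELS.index(text.strip().lower())


def queue_number_to_text(number):
    """Map a numeric queue status back to its textual form."""
    if not 0 <= number < len(QUEUE_LEVELS):
        raise ValueError(f"queue status out of range: {number}")
    return QUEUE_LEVELS[number]
```

A numeric status is convenient in scripts, for example to submit further recalls only while the value stays below a chosen threshold.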

1.8.8 (2023-04-25)#

  • new command searchid_exists

  • iscached now accepts a directory/namespace as input (with -R set)

1.8.7 (2023-04-13)#

  • changed exit code of checksum when a file is stored on more than one tape from 1 to 2

  • editorial changes in the changelog

1.8.6 (2023-04-06)#

  • minor bugfixes in the error output messages

  • properly exit when wrong parameters are provided (in some situations)

1.8.5 (2023-04-06)#

  • new flag --help to print the help for a specific command; e.g. slk_helpers --help mkdir will print the help for mkdir

  • new hidden flag --pid will print the Linux process id of the Java virtual machine

  • group_files_by_tape has new flags:
    • --set-max-tape-number-per-search <N> / --smtnps <N> which causes the searches to be run not for one tape but for a maximum of N tapes – only if less than 50 files are to be retrieved per search

    • -v (same as --print-progress) and -vv for verbose and double-verbose output, respectively

  • no command help is printed when a command is used the wrong way

  • fixed: commands which expect a list of Strings did not recognize wrong parameters but interpreted them as items of the list

1.8.4 (2023-04-05)#

  • iscached and size also accept resource ids (via flag --resource-id) in addition to resource paths

  • iscached:
    • also accepts search ids (via flag --search-id) in addition to resource paths and resource ids

    • got the flags -v and -vv for verbose and double verbose mode, respectively

  • resource_type and resource_permissions expect a resource path by default xor a resource id via --resource-id

  • internal changes related to the new class Resource

  • group_files_by_tape:
    • fixed handling of files which have no storage information

    • new parameter --search-query '<search_query>'

    • internal searches are performed differently, which is partly more efficient

  • gen_file_query:
    • new parameters --cached-only, --not-cached (currently not working) and --tape-barcodes TAPE1,TAPE2,...

    • bugs in the JSON output were fixed

    • properly deals with files without storage information

    • width of status in normal text output (value in brackets) increased by one

1.8.3 (2023-03-23)#

  • iscached properly prints the cache

  • changes in the code base: new classes Resource and Checksums

  • group_files_by_tape / gfbt has flags --json and --json-pretty

  • checksum did not work after update to 1.8.1

1.8.2 (2023-03-21)#

  • job_status: fixed a certain job status which caused job_status to fail

1.8.1 (2023-03-20)#

  • a file might be split into multiple parts, which are stored on separate tapes; this was not captured properly by the following commands and is fixed now:
    • checksum: prints an error when a file is split because no checksums are available for the overall file (only for the file parts)

    • gfbt: properly identifies files stored on multiple tapes

1.8.0 (2023-03-14)#

  • new commands:
    • resource_permissions: print permissions of a resource

    • resource_type: print type of a resource (‘namespace’ or ‘file’)

    • resource_path: same as resourcepath

    • tape_barcode: get tape barcode from tape id (barcode needed for search queries)

    • tape_id: get tape id from barcode

  • new arguments:
    • --tape-barcode is new for tape_exists and tape_status

    • --print-tape-barcode, -c/--count-files and --print-progress are new for group_files_by_tape / gfbt

  • search_limited has been deprecated; please use slk search

  • tests for tape_id, tape_barcode, tape_exists, resource_type, resource_permissions

1.7.6 (2023-03-07)#

  • minor bugfixes in the output of the command metadata

1.7.5 (2023-03-01)#

  • extended the command json2hsm by the argument -j/--json-string JSON_STRING, which allows passing a JSON string directly to the command instead of writing it into a file. If a filename is provided in addition, an error is thrown.

  • the commands hsm2json and metadata have a new argument --print-hidden; they do not print the field netcdf.Data by default (and other sidecar data); these data are printed when the new argument --print-hidden is set

1.7.4 (2023-02-10)#

  • removed , in the output of group_files_by_tape

  • improved conversion of dates in hsm2json and json2hsm

  • hsm2json exports dates according to ISO 8601

  • change JSON metadata standard from 2.1.0 to 2.1.1
    • added JSON metadata key mime_type

1.7.3 (2023-02-06)#

  • added one missing internally used job status (PAUSING)

  • change JSON metadata standard from 2.0.0 to 2.1.0
    • added JSON metadata key protocol

    • improved usage of JSON metadata key location

  • changed code structure

  • restructured code file for metadata

1.7.2 (2023-02-01)#

  • added one missing internally used job status (ABORTING)


  • fixed new tape status ERRORSTATE


  • new commands: job_exists, job_status, job_queue

  • group_files_by_tape (and tape_status):
    • modified structure of the output

    • new tape status ERRORSTATE when the tape is in a bad state which needs intervention from the support

  • increased timeout for the time to establish a connection

  • minor restructuring of the code



  • fixed errors related to processing of JSON returned by StrongLink

  • restructured code


changes from 1.4.0 to 1.5.7

  • json2hsm / import_metadata:
    • renamed command import_metadata to json2hsm

    • removed parameter --update-only-one-resource

    • --write-mode got new option CLEAN which cleans all metadata of the selected resource before setting the new metadata (clean == removes content)

    • prints JSON formatted summaries when --print-json-summary is set

  • hsm2json / export_metadata:
    • renamed command export_metadata to hsm2json

    • hsm2json prints an export summary when new parameter --print-summary is set

    • hsm2json prints JSON formatted summaries when --print-json-summary is set

    • does not print metadata as pretty but as compact JSON when --write-compact-json is set

  • slk_helpers list_search: prints search results continuously (in contrast to slk list, which first collects all search results before printing them together)

  • updated tests


  • removed import_metadata_recursive

  • merged the three other import_metadata_* commands into import_metadata:
    • --update-only-one-resource PATH_OR_ID => like import_metadata_one_file

    • --use-res-id => like import_metadata_use_res_id

    • none of the previous flags => like import_metadata_use_abs_path

  • JSON structure of metadata was incremented from v1.0.0 to v2.0.0; v2.0.0 is equal to the output of slk tag -display RESOURCE

  • remove -Q/--fully-quiet flag (fully quiet; suppress error messages)

  • readme updated


  • new commands:
    • export_metadata

    • import_metadata_one_file

    • import_metadata_recursive

    • import_metadata_use_abs_path (hidden; meant for expert users)

    • import_metadata_use_res_id (hidden; meant for expert users)

  • new flags / arguments
    • slk_helpers metadata now has --alternative-output-format

  • slk_helpers gen_file_query accepts a file list as a string in which the files are separated by newlines

  • minor bug fixes


  • new commands
    • gen_file_query: create a query string to search files, which are provided as input

    • list_search: list search results (incl. path of resources)

    • updated exit codes


  • new commands
    • iscached: prints out whether a file is cached (== quick access) or not

    • search_limited: like slk search but works only for searches that yield 1000 results or fewer

    • version: prints the version of slk_helpers