improved retrieval workflow v01#

file version: 04 Feb 2025

current software versions: slk_helpers version 1.13.2; slk wrappers 2.0.1

quick start#

Please create a new directory and change into it: the next commands will create many new text files.

module load slk/3.3.91_h1.13.2_w2.0.1
slk_helpers gfbt <source files> -wf1 <local destinationPath>
start_recall_watcher.sh <DKRZ project slurmJobAccount>
start_retrieve_watcher.sh <DKRZ project slurmJobAccount>

Check the retrieve.log and recall.log files. Check the tapes_error.txt and files_error.txt files and report issues to beratung@dkrz.de.

If the recall and retrieve watchers stop, die or are aborted, you can resume the whole process by running the start_*.sh scripts again. Under normal conditions, the gfbt command should not be run again. If you decide to run gfbt again, please clean up the working folder first, or create a new folder and run the whole command chain there.

A detailed example is given at the end.

new / extended commands#

Two new commands are provided:

  • new command:
    • slk_helpers retrieve

    • slk_helpers recall

  • extended existing command:
    • slk_helpers group_files_by_tape / slk_helpers gfbt

The slk_helpers retrieve command only copies files from the cache to the user, and the slk_helpers recall command only copies files from tape to the cache. Thus, if a file should be retrieved from tape to the Lustre filesystem, the user has to run slk_helpers recall first and slk_helpers retrieve afterwards. At first glance, this might make retrievals look more complicated than the old slk retrieve. However, as will be explained below, these new commands are much easier to use in automated workflows.
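
For illustration, the two-step pattern looks roughly like this (a sketch only; the exact calls with file lists and resource ids are shown in the workflow below):

slk_helpers recall /arch/<project>/<path>/file.nc
# wait until the printed recall job id reports a finished job (slk_helpers job_status <jobId>)
slk_helpers retrieve -ns --destinationPath <local destinationPath> /arch/<project>/<path>/file.nc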

The improved version of the gfbt command generates a set of files in a local directory which are quite useful for automated retrieval workflows based on the two new slk_helpers commands from above. This command is needed whenever more than a handful of files has to be retrieved. Details on these files are given further below. Multiple new parameters were added for fine-grained control of gfbt and of the file creation. To simplify the usage, one central new flag -wf1 <retrieval destinationPath> was introduced (long version: --retrieval-workflow-1 <...>).

If gfbt -wf1 <...> is run in a local folder where one or more of these files already exist, the command will exit with an error. It is also possible to overwrite old files (--overwrite-output) or to append to existing files (--append-output).
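
For illustration, the three modes might be invoked as follows (same placeholders as above):

slk_helpers gfbt <source files> -wf1 <local destinationPath>                      # fails if output files already exist
slk_helpers gfbt <source files> -wf1 <local destinationPath> --overwrite-output   # replaces existing output files
slk_helpers gfbt <more source files> -wf1 <local destinationPath> --append-output # extends existing output files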

new scripts#

Four new scripts are provided for recalling and retrieving data. Two of them are the actual work horses:

  • recall_watcher.sh

  • retrieve_watcher.sh

and two scripts just start these work horses:

  • start_recall_watcher.sh

  • start_retrieve_watcher.sh

Both start_* scripts check some preconditions and then submit their respective work-horse script as a SLURM job. When a work-horse script finishes, it resubmits itself with a delay. The work-horse scripts write logging information into the log files recall.log and retrieve.log; their SLURM logs go into the subfolder logs. These scripts cannot start if slk_helpers gfbt -wf1 <...> has not been run previously in the same local folder from which they are started, because they need files created by gfbt. A central required file is config.sh, but there are more required files.
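
The self-resubmission of the work-horse scripts roughly follows the pattern below (a strongly simplified sketch, not the actual DKRZ scripts; the real scripts read their settings from config.sh):

#!/bin/bash
#SBATCH --job-name=recall_watcher
#SBATCH --output=logs/%x_%j.log
# ... one round of work: check tape states, submit recalls, append to recall.log ...
# resubmit this script with a delay of 600 seconds so that it keeps watching
sbatch --begin=now+600 recall_watcher.sh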

new workflow#

The general idea of the workflow is that one recall is run per tape but only one retrieval is run for all files.

Why one recall per tape?#

First, this makes the tape access as efficient as possible because each tape has to be accessed only once.

Second, this does not overload the StrongLink system with multiple requests for one tape or with requests targeting too many tapes at once. Details on why this is bad are given here: https://docs.dkrz.de/doc/datastorage/hsm/retrievals.html#aggregate-file-retrievals (TODO).

Why can't we do this with the existing slk commands?#

Key points:

  • slk recall and slk retrieve do not accept file lists but only a single file, a namespace or a search id. In the past, file lists had to be provided via searches / search ids, which can be slow for long file lists. slk_helpers recall and slk_helpers retrieve accept a plain file list, and a user can also pipe files into the commands.

  • slk_helpers recall and slk_helpers retrieve accept resource ids as input in addition to resource paths, which considerably shortens the calls of these commands.

  • slk recall and slk retrieve are made for interactive use. Non-interactive use in scripts is very complicated and error-prone.

  • slk retrieve automatically starts recalls for files which are not in the cache. This is practical when only one file is needed but can cause issues when a large dataset is requested. It does not allow us to “get everything we need that is currently in the cache”.

  • slk_helpers recall and slk_helpers retrieve offer parameters that make these commands only fetch files which do not already exist in a user-provided destinationPath folder.

  • StrongLink has an internal queue for recall jobs. slk recall submits a job, prints the job id to the slk log (~/.slk/slk-cli.log) and waits until the recall job is finished. This is very inefficient because, for example, a SLURM job has to run for the whole time the recall job runs. The same holds for slk retrieve. Instead, the new slk_helpers recall works like sbatch: it submits a recall job, prints the job id to the terminal and quits. A user (or script) can then check via slk_helpers job_status whether the job is still running or not (see the sketch after this list).
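
This sbatch-like behaviour enables a simple submit-then-poll pattern (a sketch based on the commands above; possible job states are shown further below):

job_id=$(cat files_tape_<tapeBarcode>.txt | slk_helpers recall --resource-ids)
while [ "$(slk_helpers job_status "$job_id")" = "PROCESSING" ]; do
    sleep 60    # no SLURM job has to idle while the recall job runs inside StrongLink
done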

What does the new workflow look like?#

First of all: DKRZ provides two scripts which perform most of the steps described below automatically. A user only needs to run slk_helpers gfbt and then start the two starter scripts.

The user first runs slk_helpers group_files_by_tape:

slk_helpers group_files_by_tape <source files> -wf1 <local destinationPath> -v

When -v is set, verbose output is printed. The command might take a while in some situations; having the verbose option activated helps with staying calm and patient.

The command creates multiple files:

  • config.sh: various environment variables used for configuring the recall and retrieve watcher scripts

  • files_all.txt: list of all files to be retrieved; used by retrieve_watcher.sh

  • files_cached.txt: list of all cached files; not explicitly used

  • files_multipleTapes.txt: list of all HPSS files split across two tapes; these require special treatment; used by recall_watcher.sh after all tapes in tapes.txt have been processed

  • files_notStored.txt: list of files which exist as metadata entries but do not have a real file attached; there were a few files archived by HPSS for which this was the case

  • files_ignored.txt: list of files which are ignored for the retrieval because they exist already in the local destinationPath

  • files_tape_<TAPE_BARCODE>.txt: list of all files which should be recalled from the tape with the given TAPE_BARCODE; used by recall_watcher.sh

  • tapes.txt: list of all tapes from which data should be recalled; used by recall_watcher.sh; a corresponding file files_tape_<TAPE_BARCODE>.txt has to exist for each tape barcode in this list
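
A quick sanity check of the generated lists can be done with standard tools, for example:

wc -l tapes.txt files_all.txt files_cached.txt    # number of tapes and of files to handle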

Now, we need to run slk_helpers recall for each tape. First, we get a list of tapes:

$ cat tapes.txt
<tapeBarcode1>
<tapeBarcode2>
<tapeBarcode3>
<tapeBarcode4>
...

Second, we should check whether the tapes are actually available:

$ slk_helpers tape_status <tapeBarcode1>
AVAILABLE

$ slk_helpers tape_status <tapeBarcode2>
BLOCKED

$ slk_helpers tape_status <tapeBarcode3>
ERRORSTATE

$ slk_helpers tape_status <tapeBarcode4>
AVAILABLE

Data can be requested from AVAILABLE tapes.

BLOCKED tapes are currently used by other jobs. In order to prevent StrongLink from becoming slow, the new slk_helpers recall allows only one active job per tape (see for details: https://docs.dkrz.de/doc/datastorage/hsm/retrievals.html#aggregate-file-retrievals). Recall jobs submitted for a BLOCKED tape will fail.

Please inform us when one of your tapes is in an ERRORSTATE. Commonly, this points to a warning flag set in the metadata or to an inconsistency in the tape metadata. Some of these error states can be resolved by the StrongLink admins at DKRZ; others have to be reset by the StrongLink support. Although DKRZ staff reset such tape states regularly or request this from the StrongLink support, some tapes might be overlooked because StrongLink does not allow searching for all tapes in such a state.
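
Instead of checking each tape by hand, the status check can be scripted (a sketch based on the commands above):

for tape in $(cat tapes.txt); do
    echo "${tape}: $(slk_helpers tape_status ${tape})"
done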

Third, we actually run slk_helpers recall for AVAILABLE tapes:

$ cat files_tape_<tapeBarcode1>.txt | slk_helpers recall --resource-ids
<jobIdA>

$ cat files_tape_<tapeBarcode4>.txt | slk_helpers recall --resource-ids
<jobIdB>

...

No more than four recalls should run at once. For certain tape types, only two should run in parallel because the number of available tape drives is very low. A user can check the running state of a job via slk_helpers job_status:

$ slk_helpers job_status <jobIdA>
SUCCESSFUL

$ slk_helpers job_status <jobIdB>
PROCESSING

When a job has the status FAILED, it should be resubmitted. If it fails multiple times, please contact the DKRZ Beratung.
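
A scripted version of this step could submit one recall per AVAILABLE tape while respecting the limit of four parallel recalls (a simplified sketch; the provided recall_watcher.sh implements a more robust version of this logic):

count=0
for tape in $(cat tapes.txt); do
    if [ "$(slk_helpers tape_status ${tape})" = "AVAILABLE" ]; then
        job_id=$(cat files_tape_${tape}.txt | slk_helpers recall --resource-ids)
        echo "tape ${tape}: recall job ${job_id}"
        count=$((count + 1))
    fi
    if [ ${count} -ge 4 ]; then
        break    # poll the submitted jobs with slk_helpers job_status before submitting more
    fi
done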

In parallel to the running recall processes, the retrieval can be run:

cat files_all.txt | slk_helpers retrieve --resource-ids -ns --destinationPath <local destinationPath> -vv

Since -vv is set, doubly verbose output will be printed. This command will retrieve all files which are already in the cache and not yet present in the local destinationPath. We recommend setting -ns, which reconstructs the full archival path below the local destinationPath; this flag is implicitly assumed by slk_helpers gfbt -wf1 <...>.

This command can be run repeatedly until all requested files have been retrieved. The command returns exit code 2 if a general error occurs and exit code 3 if a timeout occurs. As long as at least one file is not cached yet, the command returns exit code 1. This makes the command easy to use in a script / automated workflow (see the sketch after this list):

  • exit code 0: all files successfully retrieved or already present in the local destinationPath

  • exit code 1: re-run slk_helpers retrieve

  • exit code 2: stop retrieving and check the error message

  • exit code 3: wait a bit and submit slk_helpers retrieve again
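
These exit codes translate directly into a simple retry loop (a sketch; retrieve_watcher.sh implements a more complete version of this logic):

while true; do
    cat files_all.txt | slk_helpers retrieve --resource-ids -ns --destinationPath <local destinationPath>
    case $? in
        0) break ;;        # done: everything retrieved or already present
        1) sleep 600 ;;    # some files are not cached yet: wait for the recalls, then re-run
        2) exit 1 ;;       # general error: stop and check the error message
        3) sleep 60 ;;     # timeout: wait a bit and re-run
    esac
done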

usage example#

We want to get CCLM forcing made from ERA5 data for the years 1973, 1974 and 1975. We are in project ab1234 and want to retrieve the data to /work/ab1234/forcing. Please create a new directory and change into it: the next commands will create many new text files.

Zero, load the appropriate slk module:

module load slk/3.3.91_h1.13.2_w2.0.1

First, we run the gfbt command for the years 1974 and 1975 only because we forgot that we also need 1973:

$ slk_helpers gfbt -R /arch/pd1309/forcings/reanalyses/ERA5/year1974 /arch/pd1309/forcings/reanalyses/ERA5/year1975 -wf1 /work/ab1234/forcing -v
# command line output is given further below for the interested reader

The output ends with a nice summary of how many files need to be recalled from which tapes. We realize that 1973 is missing and simply let gfbt append the information to the generated files (--append-output):

$ slk_helpers gfbt -R /arch/pd1309/forcings/reanalyses/ERA5/year1973 -wf1 /work/ab1234/forcing --append-output -v
# command line output is given further below for the interested reader

Please notify us when you see tapes in ERRORSTATE.

Now, there should be multiple new files in the current directory. Please remain in this directory and proceed.

Next, we submit the watcher scripts:

$ start_recall_watcher.sh ab1234
successfully submitted recall watcher job with SLURM job id '1234567'

$ start_retrieve_watcher.sh ab1234
successfully submitted retrieve watcher job with SLURM job id '1234568'

Check the retrieve.log and recall.log files. Check the tapes_error.txt and files_error.txt files and report issues to beratung@dkrz.de.

That's it!

Command line output of the first gfbt command:

$ slk_helpers gfbt -R /arch/pd1309/forcings/reanalyses/ERA5/year1974 /arch/pd1309/forcings/reanalyses/ERA5/year1975 -wf1 /work/ab1234/forcing -v
progress: generating file grouping based on search id 826348 in preparation
progress: generating file grouping based on search id 826348 (for up to 190 files) started
collection storage information for search id 826348 started
Number of pages with up to 1000 resources per page to iterate: 1
collection storage information for search id 826348 finished
creating and returning object to host resource storage information
progress: generating file grouping based on search id 826348 (for up to 190 files) finished
progress: getting tape infos for 51 tapes started
progress: getting tape infos for 51 tapes finished
progress: extracting tape stati for 51 tapes started
progress: extracting tape stati for 51 tapes finished
------------------------------------------------------------------------------
progress: updating tape infos for 51 tapes started
progress: updating tape infos for 51 tapes finished
progress: extracting tape stati for 51 tapes started
progress: extracting tape stati for 51 tapes finished
------------------------------------------------------------------------------
    cached (AVAILABLE  ): 23
M24350M8 (BLOCKED    ): 2
M24365M8 (AVAILABLE  ): 3
M24366M8 (AVAILABLE  ): 2
M21306M8 (AVAILABLE  ): 2
M21307M8 (AVAILABLE  ): 1
M21314M8 (ERRORSTATE): 4
M21315M8 (AVAILABLE  ): 1
M24390M8 (AVAILABLE  ): 1
M24391M8 (AVAILABLE  ): 1
M24280M8 (AVAILABLE  ): 3
M21336M8 (AVAILABLE  ): 2
M21341M8 (AVAILABLE  ): 2
M21344M8 (AVAILABLE  ): 1
M21345M8 (AVAILABLE  ): 5
M21342M8 (BLOCKED    ): 8
M22372M8 (BLOCKED    ): 1
M21348M8 (BLOCKED    ): 3
M21349M8 (BLOCKED    ): 3
M21346M8 (AVAILABLE  ): 3
M21347M8 (AVAILABLE  ): 2
M21350M8 (AVAILABLE  ): 1
M24294M8 (AVAILABLE  ): 1
M24295M8 (AVAILABLE  ): 1
M24173M8 (AVAILABLE  ): 3
M22509M8 (AVAILABLE  ): 1
M21360M8 (AVAILABLE  ): 1
M21358M8 (AVAILABLE  ): 1
M21362M8 (AVAILABLE  ): 5
M21363M8 (AVAILABLE  ): 3
M32623M8 (AVAILABLE  ): 7
M21369M8 (AVAILABLE  ): 1
M32621M8 (AVAILABLE  ): 7
M32626M8 (AVAILABLE  ): 11
M32627M8 (AVAILABLE  ): 10
M22395M8 (AVAILABLE  ): 4
M32630M8 (ERRORSTATE): 3
M24320M8 (AVAILABLE  ): 1
M24321M8 (AVAILABLE  ): 3
M32631M8 (AVAILABLE  ): 7
M22655M8 (AVAILABLE  ): 3
M24324M8 (AVAILABLE  ): 1
M32635M8 (AVAILABLE  ): 10
M24325M8 (AVAILABLE  ): 1
M32632M8 (AVAILABLE  ): 8
M22659M8 (AVAILABLE  ): 3
M32638M8 (AVAILABLE  ): 8
M21385M8 (AVAILABLE  ): 1
M32636M8 (AVAILABLE  ): 4
M24202M8 (AVAILABLE  ): 1
M32640M8 (ERRORSTATE): 4
M32377M8 (AVAILABLE  ): 2
------------------------------------------------------------------------------

Command line output of the second gfbt command:

progress: generating file grouping based on search id 826349 in preparation
progress: generating file grouping based on search id 826349 (for up to 95 files) started
collection storage information for search id 826349 started
Number of pages with up to 1000 resources per page to iterate: 1
collection storage information for search id 826349 finished
creating and returning object to host resource storage information
progress: generating file grouping based on search id 826349 (for up to 95 files) finished
progress: getting tape infos for 43 tapes started
progress: getting tape infos for 43 tapes finished
progress: extracting tape stati for 43 tapes started
progress: extracting tape stati for 43 tapes finished
------------------------------------------------------------------------------
progress: updating tape infos for 43 tapes started
progress: updating tape infos for 43 tapes finished
progress: extracting tape stati for 43 tapes started
progress: extracting tape stati for 43 tapes finished
------------------------------------------------------------------------------
M24277M8 (AVAILABLE  ): 2
M24339M8 (AVAILABLE  ): 1
M24280M8 (AVAILABLE  ): 3
M21336M8 (AVAILABLE  ): 1
M24278M8 (AVAILABLE  ): 5
M22422M8 (AVAILABLE  ): 1
M24279M8 (AVAILABLE  ): 1
M21340M8 (AVAILABLE  ): 1
M24221M8 (AVAILABLE  ): 1
M21345M8 (AVAILABLE  ): 2
M24350M8 (BLOCKED    ): 1
M21342M8 (BLOCKED    ): 2
M24351M8 (AVAILABLE  ): 1
M24223M8 (AVAILABLE  ): 1
M22372M8 (BLOCKED    ): 5
M21349M8 (AVAILABLE  ): 4
M21346M8 (AVAILABLE  ): 2
M21347M8 (AVAILABLE  ): 2
M21350M8 (AVAILABLE  ): 1
M24294M8 (AVAILABLE  ): 1
M32016M8 (AVAILABLE  ): 1
M24366M8 (AVAILABLE  ): 1
M21363M8 (AVAILABLE  ): 3
M32623M8 (AVAILABLE  ): 5
M21305M8 (AVAILABLE  ): 1
M32621M8 (AVAILABLE  ): 3
M32626M8 (AVAILABLE  ): 5
M32627M8 (AVAILABLE  ): 5
M24379M8 (AVAILABLE  ): 1
M22395M8 (AVAILABLE  ): 2
M32630M8 (ERRORSTATE): 1
M32631M8 (AVAILABLE  ): 5
M22655M8 (AVAILABLE  ): 3
M32635M8 (AVAILABLE  ): 6
M21314M8 (ERRORSTATE): 2
M32632M8 (AVAILABLE  ): 1
M32638M8 (AVAILABLE  ): 4
M24390M8 (AVAILABLE  ): 1
M32636M8 (AVAILABLE  ): 2
M24391M8 (AVAILABLE  ): 1
M21322M8 (AVAILABLE  ): 1
M32640M8 (ERRORSTATE): 2
M32119M8 (AVAILABLE  ): 1
------------------------------------------------------------------------------