improved retrieval workflow v01#
file version: 04 Feb 2025
current software versions: slk_helpers version 1.13.2; slk wrappers 2.0.1
quick start#
Please create a new directory and change into it; the following commands will create many new text files there.
module load slk/3.3.91_h1.13.2_w2.0.1
slk_helpers gfbt <source files> -wf1 <local destinationPath>
start_recall_watcher.sh <DKRZ project slurmJobAccount>
start_retrieve_watcher.sh <DKRZ project slurmJobAccount>
Check the retrieve.log and recall.log files. Check the tapes_error.txt and files_error.txt files and report issues to beratung@dkrz.de.
When the recall and retrieve watchers stop, die or are aborted, you can resume the whole process by starting the start_*.sh scripts again. The gfbt command should not be run again under normal conditions. If you decide to run the gfbt command again, please clean up the working folder first, or create a new folder and run the whole command chain there.
A detailed example is given at the end.
new / extended commands#
Two new commands and one extended command are provided:
- new commands: slk_helpers retrieve and slk_helpers recall
- extended existing command: slk_helpers group_files_by_tape / slk_helpers gfbt
The slk_helpers retrieve command only copies files from the cache to the user, and the slk_helpers recall command only copies files from tape to the cache. Thus, if a file should be retrieved from tape to the Lustre filesystem, the user has to run slk_helpers recall first and slk_helpers retrieve afterwards. At first glance, this might make retrievals look more complicated than with the old slk retrieve. However, as will be explained below, the new commands are much easier to use in automated workflows.
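A minimal sketch of this two-step process for a single file (the path, job id and destination are placeholders; we assume here that a file path can be passed directly as an argument, and the options -ns and --destinationPath are explained further below):
$ slk_helpers recall /arch/<project>/<path>/file.nc
<jobId>
$ slk_helpers job_status <jobId>
SUCCESSFUL
$ slk_helpers retrieve /arch/<project>/<path>/file.nc -ns --destinationPath <local destinationPath>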
The improved version of the gfbt command generates certain files in a local directory which are quite useful for automated retrieval workflows using the two new slk_helpers commands from above. This command is needed when more than a handful of files has to be retrieved. Details on these files are given further below. Multiple new parameters were added to fine-control gfbt and the file creation. To simplify the usage, one central new flag -wf1 <retrieval destinationPath> was introduced (long version: --retrieval-workflow-1 <...>).
If gfbt -wf1 <...> is run in a local folder where these files, or at least one of them, already exist, the command will exit with an error. Alternatively, old files can be overwritten (--overwrite-output) or existing files can be extended (--append-output).
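For example, a rerun of gfbt in the same working folder could look like this (placeholders as above):
$ slk_helpers gfbt <source files> -wf1 <local destinationPath> --overwrite-output
$ slk_helpers gfbt <additional source files> -wf1 <local destinationPath> --append-output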
new scripts#
Four new scripts are provided for recalling and retrieving data. Two scripts are the actual work horses:
recall_watcher.sh
retrieve_watcher.sh
and two scripts just start these work horses:
start_recall_watcher.sh
start_retrieve_watcher.sh
Both start_* scripts check some preconditions and then submit their respective work-horse script as a SLURM job. The work-horse scripts resubmit themselves with a delay when they finish. They write logging information into the log files recall.log and retrieve.log; their SLURM logs are written to the subfolder logs. These scripts cannot start if slk_helpers gfbt -wf1 <...> has not been run previously in the local folder from which they are started, because they need files created by gfbt. A central required file is config.sh, but there are more required files.
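After gfbt has been run, starting and monitoring the watchers might look like this (ab1234 is a placeholder project account; squeue is the standard SLURM queue listing):
$ start_recall_watcher.sh ab1234
$ start_retrieve_watcher.sh ab1234
$ squeue -u $USER
$ tail -f recall.log retrieve.log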
new workflow#
The general idea of the workflow is that one recall is run per tape but only one retrieval is run for all files.
Why one recall per tape?#
First, this makes tape access most efficient because each tape has to be accessed only once.
Second, this does not overload the StrongLink system with multiple requests for one tape or with requests targeting too many tapes at once. Details on why this is bad are given here: https://docs.dkrz.de/doc/datastorage/hsm/retrievals.html#aggregate-file-retrievals (TODO).
Why can't we do this with the existing slk commands?#
Key points:
- slk recall and slk retrieve do not accept file lists but only a single file, a namespace or a search id. In the past, file lists had to be provided via searches / search ids, which might be slow for long file lists.
- slk_helpers recall and slk_helpers retrieve accept a plain file list, and a user can also pipe files into the commands.
- slk_helpers recall and slk_helpers retrieve accept resource ids as input in addition to resource paths, which considerably shortens calls of these commands.
- slk recall and slk retrieve are made for interactive use. Non-interactive use in scripts is very complicated and error-prone.
- slk retrieve automatically starts recalls for files which are not in the cache. This is practical when only one file is needed but might cause issues when a large dataset is requested. It does not allow to “get everything we need which is currently in the cache”.
- slk_helpers recall and slk_helpers retrieve offer parameters to tell these commands to only fetch files which do not already exist in a user-provided destinationPath folder.
- StrongLink has an internal queue for recall jobs. slk recall submits a job, prints the job id to the slk log (~/.slk/slk-cli.log) and waits until the recall job is finished. This is very inefficient because, e.g., a SLURM job needs to run for the whole time the recall job runs. The same holds for slk retrieve. Instead, the new slk_helpers recall works like sbatch: it submits a recall job, prints the job id to the terminal and quits. A user (or script) can then check via slk_helpers job_status whether the job is still running (see the sketch after this list).
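A minimal sketch of this sbatch-like usage for one tape (the tape barcode and the polling interval are placeholders; the job states shown further below include PROCESSING, SUCCESSFUL and FAILED):
# submit one recall job for one tape and poll while it is still being processed
jobid=$(cat files_tape_<tapeBarcode>.txt | slk_helpers recall --resource-ids)
while [ "$(slk_helpers job_status "$jobid")" = "PROCESSING" ]; do
    sleep 300    # poll every five minutes
done
slk_helpers job_status "$jobid"    # final state, e.g. SUCCESSFUL or FAILED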
What does the new workflow look like?#
First: DKRZ provides two scripts which do most of the steps described below automatically. A user only needs to run slk_helpers gfbt and then start the two starter scripts.
The user first runs slk_helpers group_files_by_tape:
slk_helpers group_files_by_tape <source files> -wf1 <local destinationPath> -v
When -v is set, verbose output is printed. The command might take a bit longer in some situations; having the verbose option activated might help to stay calm and patient.
The command creates multiple files:
- config.sh: various environment variables used for configuring the recall and retrieve watcher scripts
- files_all.txt: list of all files to be retrieved; used by retrieve_watcher.sh
- files_cached.txt: list of all cached files; not explicitly used
- files_multipleTapes.txt: list of all HPSS files split amongst two tapes; these require special treatment; used by recall_watcher.sh after all tapes in tapes.txt have been processed
- files_notStored.txt: list of files which exist as metadata entries but do not have a real file attached; there were a few files archived by HPSS for which this was the case
- files_ignored.txt: list of files which are ignored for the retrieval because they already exist in the local destinationPath
- files_tape_<TAPE_BARCODE>.txt: list of all files which should be recalled from the tape with the given TAPE_BARCODE; used by recall_watcher.sh
- tapes.txt: list of all tapes from which data should be recalled; used by recall_watcher.sh; a corresponding file files_tape_<TAPE_BARCODE>.txt has to exist for each tape barcode in this list
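For a quick sanity check of the generated files, standard shell tools suffice (a small sketch; file names as listed above):
$ wc -l tapes.txt files_all.txt files_cached.txt
$ cat files_notStored.txt files_ignored.txt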
Now, we need to run slk_helpers recall for each tape. First, we get a list of tapes:
$ cat tapes.txt
<tapeBarcode1>
<tapeBarcode2>
<tapeBarcode3>
<tapeBarcode4>
...
Second, we should check whether the tapes are actually available:
$ slk_helpers tape_status <tapeBarcode1>
AVAILABLE
$ slk_helpers tape_status <tapeBarcode2>
BLOCKED
$ slk_helpers tape_status <tapeBarcode3>
ERRORSTATE
$ slk_helpers tape_status <tapeBarcode4>
AVAILABLE
Data can be requested from AVAILABLE tapes.
BLOCKED tapes are currently used by other jobs. In order to prevent StrongLink from becoming slow, the new slk_helpers recall allows only one active job per tape (see for details: https://docs.dkrz.de/doc/datastorage/hsm/retrievals.html#aggregate-file-retrievals). Recall jobs submitted for a BLOCKED tape will fail.
Please inform us when one of your tapes is in an ERRORSTATE. Commonly, this points to a warning flag set in the metadata or to an inconsistency in the tape metadata. Some of these error states can be resolved by the StrongLink admins at DKRZ; others have to be reset by the StrongLink support. Although DKRZ staff resets such tape states regularly or requests this from the StrongLink support, some tapes might be overlooked because StrongLink does not allow searching for all tapes in such a state.
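Instead of checking each tape by hand, the status of all tapes can be listed in one go (a small sketch using only the commands shown above):
# print the status of every tape listed in tapes.txt
while read -r tape; do
    echo "$tape: $(slk_helpers tape_status "$tape")"
done < tapes.txt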
Third, we actually run slk_helpers recall for the AVAILABLE tapes:
$ cat files_tape_<tapeBarcode1>.txt | slk_helpers recall --resource-ids
<jobIdA>
$ cat files_tape_<tapeBarcode4>.txt | slk_helpers recall --resource-ids
<jobIdB>
...
No more than four recalls should be running at once. For certain tape types, only two should run in parallel because the number of available tape drives is very low. A user can check the state of a job via slk_helpers job_status:
$ slk_helpers job_status <jobIdA>
SUCCESSFUL
$ slk_helpers job_status <jobIdB>
PROCESSING
When a job is FAILED, it should be resubmitted. If it fails multiple times, the DKRZ Beratung should be contacted.
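For example (job ids and the tape barcode are placeholders; <jobIdC> stands for the id of the resubmitted job):
$ slk_helpers job_status <jobIdB>
FAILED
$ cat files_tape_<tapeBarcode4>.txt | slk_helpers recall --resource-ids
<jobIdC>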
In parallel to the running recall jobs, the retrieval can run:
cat files_all.txt | slk_helpers retrieve --resource-ids -ns --destinationPath <local destinationPath> -vv
Since -vv is set, doubly verbose output will be printed. This command will retrieve all files which are already in the cache and not yet available in the local destinationPath. We recommend setting -ns, which reconstructs the full archival path below the local destinationPath; this flag is implicitly assumed by slk_helpers gfbt -wf1 <...>.
This command can be run repeatedly until all requested files have been retrieved. The command returns exit code 2 if a general error occurs and exit code 3 if a timeout occurs. As long as at least one file is not cached yet, the command returns exit code 1. This makes the command easy to use in a script / automated workflow (see the sketch after this list):
- exit code 0: all files were successfully retrieved or already exist in the local destinationPath
- exit code 1: re-run slk_helpers retrieve
- exit code 2: stop retrieving and check the error message
- exit code 3: wait a bit and submit slk_helpers retrieve again
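A minimal retry-loop sketch built on these exit codes (the sleep intervals are arbitrary placeholders):
# re-run slk_helpers retrieve until all files are in <local destinationPath>
while true; do
    cat files_all.txt | slk_helpers retrieve --resource-ids -ns --destinationPath <local destinationPath>
    case $? in
        0) echo "all files retrieved"; break ;;
        1) sleep 600 ;;    # some files are not cached yet; re-run later
        2) echo "general error; please check the error message"; exit 2 ;;
        3) sleep 60 ;;     # timeout; wait a bit and submit again
    esac
done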
usage example#
We want to get CCLM forcing made from ERA5 data for the years 1973, 1974 and 1975. We are in project ab1234 and want to retrieve the data to /work/ab1234/forcing. Please create a new directory and change into it; the next commands will create many new text files there.
Zero, load the appropriate slk module:
module load slk/3.3.91_h1.13.2_w2.0.1
First, we run the gfbt command for the years 1974 and 1975 only because we forgot that we also need 1973:
$ slk_helpers gfbt -R /arch/pd1309/forcings/reanalyses/ERA5/year1974 /arch/pd1309/forcings/reanalyses/ERA5/year1975 -wf1 /work/ab1234/forcing -v
# command line output is given further below for the interested reader
The output shows a nice summary of how many files need to be recalled from which tapes. We realize that 1973 is missing and simply let gfbt append the information to the generated files (--append-output):
$ slk_helpers gfbt -R /arch/pd1309/forcings/reanalyses/ERA5/year1973 -wf1 /work/ab1234/forcing --append-output -v
# command line output is given further below for the interested reader
Please notify us when you see tapes in ERRORSTATE.
Now, there should be multiple new files in the current directory. Please remain in this directory and proceed.
Next, we submit the watcher scripts:
$ start_recall_watcher.sh ab1234
successfully submitted recall watcher job with SLURM job id '1234567'
$ start_retrieve_watcher.sh ab1234
successfully submitted retrieve watcher job with SLURM job id '1234568'
Check the retrieve.log and recall.log files. Check the tapes_error.txt and files_error.txt files and report issues to beratung@dkrz.de.
That's it!
Command line output of the first gfbt command:
$ slk_helpers gfbt -R /arch/pd1309/forcings/reanalyses/ERA5/year1974 /arch/pd1309/forcings/reanalyses/ERA5/year1975 -wf1 /work/ab1234/forcing -v
progress: generating file grouping based on search id 826348 in preparation
progress: generating file grouping based on search id 826348 (for up to 190 files) started
collection storage information for search id 826348 started
Number of pages with up to 1000 resources per page to iterate: 1
collection storage information for search id 826348 finished
creating and returning object to host resource storage information
progress: generating file grouping based on search id 826348 (for up to 190 files) finished
progress: getting tape infos for 51 tapes started
progress: getting tape infos for 51 tapes finished
progress: extracting tape stati for 51 tapes started
progress: extracting tape stati for 51 tapes finished
------------------------------------------------------------------------------
progress: updating tape infos for 51 tapes started
progress: updating tape infos for 51 tapes finished
progress: extracting tape stati for 51 tapes started
progress: extracting tape stati for 51 tapes finished
------------------------------------------------------------------------------
cached (AVAILABLE ): 23
M24350M8 (BLOCKED ): 2
M24365M8 (AVAILABLE ): 3
M24366M8 (AVAILABLE ): 2
M21306M8 (AVAILABLE ): 2
M21307M8 (AVAILABLE ): 1
M21314M8 (ERRORSTATE): 4
M21315M8 (AVAILABLE ): 1
M24390M8 (AVAILABLE ): 1
M24391M8 (AVAILABLE ): 1
M24280M8 (AVAILABLE ): 3
M21336M8 (AVAILABLE ): 2
M21341M8 (AVAILABLE ): 2
M21344M8 (AVAILABLE ): 1
M21345M8 (AVAILABLE ): 5
M21342M8 (BLOCKED ): 8
M22372M8 (BLOCKED ): 1
M21348M8 (BLOCKED ): 3
M21349M8 (BLOCKED ): 3
M21346M8 (AVAILABLE ): 3
M21347M8 (AVAILABLE ): 2
M21350M8 (AVAILABLE ): 1
M24294M8 (AVAILABLE ): 1
M24295M8 (AVAILABLE ): 1
M24173M8 (AVAILABLE ): 3
M22509M8 (AVAILABLE ): 1
M21360M8 (AVAILABLE ): 1
M21358M8 (AVAILABLE ): 1
M21362M8 (AVAILABLE ): 5
M21363M8 (AVAILABLE ): 3
M32623M8 (AVAILABLE ): 7
M21369M8 (AVAILABLE ): 1
M32621M8 (AVAILABLE ): 7
M32626M8 (AVAILABLE ): 11
M32627M8 (AVAILABLE ): 10
M22395M8 (AVAILABLE ): 4
M32630M8 (ERRORSTATE): 3
M24320M8 (AVAILABLE ): 1
M24321M8 (AVAILABLE ): 3
M32631M8 (AVAILABLE ): 7
M22655M8 (AVAILABLE ): 3
M24324M8 (AVAILABLE ): 1
M32635M8 (AVAILABLE ): 10
M24325M8 (AVAILABLE ): 1
M32632M8 (AVAILABLE ): 8
M22659M8 (AVAILABLE ): 3
M32638M8 (AVAILABLE ): 8
M21385M8 (AVAILABLE ): 1
M32636M8 (AVAILABLE ): 4
M24202M8 (AVAILABLE ): 1
M32640M8 (ERRORSTATE): 4
M32377M8 (AVAILABLE ): 2
------------------------------------------------------------------------------
Command line output of the second gfbt command:
progress: generating file grouping based on search id 826349 in preparation
progress: generating file grouping based on search id 826349 (for up to 95 files) started
collection storage information for search id 826349 started
Number of pages with up to 1000 resources per page to iterate: 1
collection storage information for search id 826349 finished
creating and returning object to host resource storage information
progress: generating file grouping based on search id 826349 (for up to 95 files) finished
progress: getting tape infos for 43 tapes started
progress: getting tape infos for 43 tapes finished
progress: extracting tape stati for 43 tapes started
progress: extracting tape stati for 43 tapes finished
------------------------------------------------------------------------------
progress: updating tape infos for 43 tapes started
progress: updating tape infos for 43 tapes finished
progress: extracting tape stati for 43 tapes started
progress: extracting tape stati for 43 tapes finished
------------------------------------------------------------------------------
M24277M8 (AVAILABLE ): 2
M24339M8 (AVAILABLE ): 1
M24280M8 (AVAILABLE ): 3
M21336M8 (AVAILABLE ): 1
M24278M8 (AVAILABLE ): 5
M22422M8 (AVAILABLE ): 1
M24279M8 (AVAILABLE ): 1
M21340M8 (AVAILABLE ): 1
M24221M8 (AVAILABLE ): 1
M21345M8 (AVAILABLE ): 2
M24350M8 (BLOCKED ): 1
M21342M8 (BLOCKED ): 2
M24351M8 (AVAILABLE ): 1
M24223M8 (AVAILABLE ): 1
M22372M8 (BLOCKED ): 5
M21349M8 (AVAILABLE ): 4
M21346M8 (AVAILABLE ): 2
M21347M8 (AVAILABLE ): 2
M21350M8 (AVAILABLE ): 1
M24294M8 (AVAILABLE ): 1
M32016M8 (AVAILABLE ): 1
M24366M8 (AVAILABLE ): 1
M21363M8 (AVAILABLE ): 3
M32623M8 (AVAILABLE ): 5
M21305M8 (AVAILABLE ): 1
M32621M8 (AVAILABLE ): 3
M32626M8 (AVAILABLE ): 5
M32627M8 (AVAILABLE ): 5
M24379M8 (AVAILABLE ): 1
M22395M8 (AVAILABLE ): 2
M32630M8 (ERRORSTATE): 1
M32631M8 (AVAILABLE ): 5
M22655M8 (AVAILABLE ): 3
M32635M8 (AVAILABLE ): 6
M21314M8 (ERRORSTATE): 2
M32632M8 (AVAILABLE ): 1
M32638M8 (AVAILABLE ): 4
M24390M8 (AVAILABLE ): 1
M32636M8 (AVAILABLE ): 2
M24391M8 (AVAILABLE ): 1
M21322M8 (AVAILABLE ): 1
M32640M8 (ERRORSTATE): 2
M32119M8 (AVAILABLE ): 1
------------------------------------------------------------------------------