HSM News for Dec 2023#
Warning
This blog entry is outdated.
Hohoho,
(loosely based on “Knecht Ruprecht” by Theodor Storm)
Table of contents#
- new version slk_helpers (1.10.2): verify size of archived files and many other small new features
- new version slk wrappers (1.2.1): start automated weekly file size verifications and daily checks of the expiry date of the slk login token
- new version pyslk: pyslk has not been updated, but the current release works with the new slk_helpers
- cleanup and renaming of slk modules: please switch to slk/3.3.91_h1.10.2_w1.2.1; cleanup of old modules
new version slk_helpers (1.10.2)#
We have released slk_helpers version 1.10.2. The major changes with respect to version 1.9.x are:
- slk_helpers allows running verify jobs which check whether archived files are smaller than expected
- important new commands:
  - is_admin_session: check whether the user is currently logged in as normal user or admin user
  - job_report: fetch raw verify job report; please use result_verify_job instead if possible
  - print_rcrs: print size and checksums of file parts; some HPSS files are stored as two parts on two tapes
  - result_verify_job: list relevant errors found by a verify job
  - search_incomplete: prints whether the search is incomplete (still running)
  - search_successful: prints whether the search was successful
  - submit_verify_job: run a verify job for a provided set of files
- new job states: BLOCKED, PAUSED, STOPPED, WAITING, OTHER
- extended command size by new parameters:
  - -R/--recursive for requesting the size of the content of namespaces recursively
  - --pad-spaces-left for space padding to the left in order to align file/namespace sizes when the command is called multiple times
- improved handling of connection timeouts of StrongLink
- changes which might break established workflows:
  - slk_helpers size no longer returns exit code 2 when a namespace is targeted; if -R/--recursive is not set, it now returns exit code 0 and prints 0 as size.
The module with the new slk_helpers version is slk/3.3.91_h1.10.2_w1.2.1. It is not set as the default module yet but will be on 18 Dec. pyslk has not been updated yet, but existing functions will work with the new slk_helpers version.
See also
Archivals to tape - Verify file size (outdated)
StrongLink verify jobs (outdated)
new version slk wrappers (1.2.1)#
The slk wrapper collection has been updated to version 1.2.1:
- higher stability with respect to timeouts of StrongLink
- improved error and warning output
- two new slk wrapper scripts:
  - slk_wrapper_daily_login_check: starts SLURM jobs which check daily whether the user’s login token is still valid; if the token is due to expire, an email is sent to the user
  - slk_wrapper_weekly_verify_job: starts SLURM jobs which run a weekly verify job for all files of the user in the HSM/StrongLink cache; sends a summary email after each job has finished
Note
If the StrongLink system is under high load, timeout errors might occur. The new slk_helpers and slk_wrappers versions have been made more robust with respect to these errors. However, the wrappers might still fail due to timeouts.
See also
example on using the wrapper script in combination with
slk_helpers group_files_by_tape
new version pyslk#
No new pyslk version has been released. The current pyslk version (1.9.5) works with the new version of slk_helpers, except that no wrappers exist for the new slk_helpers commands.
cleanup and renaming of slk modules#
- many modules containing old slk/slk_helpers versions have been removed
- modules which will be removed end of Dec 2023:
  - slk/3.3.91
  - slk_helpers/1.9.9
  - slk_helpers/1.9.7
  - slk_helpers/1.9.6
  - slk_helpers/1.9.5
- new combined module for slk, slk_helpers and slk wrappers: slk/<SLK_VERSION>_h<SLK_HELPERS_VERSION>_w<SLK_WRAPPERS_VERSION>
  - current default module: slk/3.3.91_h1.9.7_w1.0.0
  - next default module: slk/3.3.91_h1.10.2_w1.2.1
  - alternative modules (please shift to slk/3.3.91_h1.10.2_w1.2.1 as soon as possible):
    - slk/3.3.91_h1.10.1_w1.2.0
    - slk/3.3.91_h1.10.2_w1.2.0
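The combined naming scheme can be composed and split with plain shell parameter expansion, which may help in job scripts that need to record which tool versions they ran with. This is a minimal sketch: the function names slk_module_name and slk_module_versions are hypothetical and not part of slk, slk_helpers or the slk wrappers.

```shell
#!/bin/sh
# Hypothetical helpers for the combined module naming scheme
#   slk/<SLK_VERSION>_h<SLK_HELPERS_VERSION>_w<SLK_WRAPPERS_VERSION>

# Compose a combined module name.
# $1 = slk version, $2 = slk_helpers version, $3 = slk wrappers version
slk_module_name() {
    printf 'slk/%s_h%s_w%s\n' "$1" "$2" "$3"
}

# Split a combined module name into its three versions.
# Prints "<slk> <slk_helpers> <slk_wrappers>"; returns 1 if the name
# does not follow the combined scheme (e.g. the old "slk/3.3.91").
slk_module_versions() {
    case "$1" in
        slk/*_h*_w*) ;;
        *) return 1 ;;
    esac
    rest=${1#slk/}                # strip the "slk/" prefix
    slk_version=${rest%%_h*}      # everything before "_h"
    rest=${rest#*_h}              # drop the slk version and "_h"
    helpers_version=${rest%%_w*}  # everything before "_w"
    wrappers_version=${rest#*_w}  # everything after "_w"
    printf '%s %s %s\n' "$slk_version" "$helpers_version" "$wrappers_version"
}
```

Assuming the usual `module load` command of the environment modules system is available, the next default module could then be loaded with `module load "$(slk_module_name 3.3.91 1.10.2 1.2.1)"`.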
file size verification#
The size of files can be verified by verify jobs. This is particularly useful to identify files which have been archived incompletely. These jobs can only target files which are stored in the HSM cache. Files which have already been written to tape have passed an internal file size verification and are not tested again.
Please start a verify job as follows:
$ slk_helpers submit_verify_job /dkrz_test/netcdf/20230925a -R
Submitting up to 1 verify job(s) based on results of search id 576002:
search results: pages 1 to 1 of 1; visible search results: 10; submitted verify job: 176395
Number of submitted verify jobs: 1
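In a script, the job id can be pulled out of this output instead of being copied by hand. The sketch below assumes that the output keeps the format shown above, with each id following the text "submitted verify job: "; the function name extract_verify_job_ids is hypothetical.

```shell
#!/bin/sh
# Read the output of `slk_helpers submit_verify_job` on stdin and print
# the id of each submitted verify job, one per line.
# Assumption: every id follows the literal text "submitted verify job: ".
extract_verify_job_ids() {
    sed -n 's/.*submitted verify job: \([0-9][0-9]*\).*/\1/p'
}
```

It would be used as `slk_helpers submit_verify_job /dkrz_test/netcdf/20230925a -R | extract_verify_job_ids`. The summary line "Number of submitted verify jobs" is not matched because it reads "jobs:" rather than "job:".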
A verify job with the id 176395 was submitted. It is placed in the same queue as recall jobs. Thus, if many files are being recalled and the StrongLink queue is full, verify jobs might have to wait some time until they are processed. The job status is checked as follows:
$ slk_helpers job_status 176395
PROCESSING
# wait a few seconds or minutes ...
$ slk_helpers job_status 176395
COMPLETED
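Repeating the status check by hand can be replaced by a small polling loop. The following is a sketch only: the function name wait_for_verify_job, the default interval and the polling budget are assumptions. It treats COMPLETED as the only final state and keeps waiting on every other state (PROCESSING, BLOCKED, PAUSED, ...), because it makes no assumption about which of them are terminal.

```shell
#!/bin/sh
# wait_for_verify_job JOB_ID [INTERVAL_SECONDS] [MAX_POLLS]
# Polls `slk_helpers job_status` until the job reports COMPLETED.
# Returns 0 on COMPLETED, 1 if the polling budget is used up first,
# 2 if the job_status call itself fails (e.g. due to a timeout).
wait_for_verify_job() {
    job_id=$1
    interval=${2:-30}     # assumed default: poll every 30 seconds
    max_polls=${3:-120}   # assumed default: give up after 120 polls
    polls=0
    status=""
    while [ "$polls" -lt "$max_polls" ]; do
        status=$(slk_helpers job_status "$job_id") || return 2
        if [ "$status" = "COMPLETED" ]; then
            return 0
        fi
        polls=$((polls + 1))
        sleep "$interval"
    done
    echo "job $job_id not completed after $max_polls polls (last status: $status)" >&2
    return 1
}
```

A call such as `wait_for_verify_job 176395 60 && slk_helpers result_verify_job 176395` would fetch the results only once the job has completed.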
The results of the verify job can be fetched via slk_helpers result_verify_job:
$ slk_helpers result_verify_job 176395
Errors:
Resource content size does not match record: /dkrz_test/netcdf/20230925a/file_001gb_b.nc
Resource content size does not match record: /dkrz_test/netcdf/20230925a/file_001gb_c.nc
Resource content size does not match record: /dkrz_test/netcdf/20230925a/file_001gb_a.nc
Resource content size does not match record: /dkrz_test/netcdf/20230925a/file_001gb_f.nc
Erroneous files: 4
Four size-mismatch errors were detected. In this case, these files should be re-archived or deleted from the archive.
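To act on such a report in a script, the affected paths can be filtered out of the output. This is a minimal sketch that assumes the error lines keep the wording shown above; the function name list_size_mismatch_files is hypothetical.

```shell
#!/bin/sh
# Read the output of `slk_helpers result_verify_job` on stdin and print
# the path of every file reported with a size mismatch, one per line.
# Assumption: each such line starts (after optional whitespace) with
# "Resource content size does not match record: ".
list_size_mismatch_files() {
    sed -n 's/^[[:space:]]*Resource content size does not match record: //p'
}
```

For example, `slk_helpers result_verify_job 176395 | list_size_mismatch_files` would print the four paths from the report above, which could then be passed on to a re-archival step.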
See also
Archivals to tape - Verify file size (outdated)
Reference: StrongLink verify jobs (outdated)
Useful links#
Archivals to tape - Verify file size (outdated)
Reference: StrongLink verify jobs (outdated)