Archivals to tape#
file version: 22 May 2023
current software versions: slk version 3.3.91; slk_helpers version 1.9.0
Introduction and Summary#
The slk archive command is available on all nodes of Levante. If you plan to archive a large file (larger than a few GB) or many files, please do not do this on login nodes but via the partitions compute, shared or interactive. Please allocate 6 GB of memory (--mem=6GB). If your slk is killed with a message like /sw/[...]/bin/slk: line 16: [...] Killed, please inform the DKRZ support (support@dkrz.de) and allocate 8 GB or 10 GB of memory. If you wish to use slk archive interactively, please start an interactive batch session via the interactive partition with salloc as follows (see also Run slk in the “interactive” partition; details on salloc: Data Processing on Levante):
salloc --mem=6GB --partition=interactive --account=YOUR_PROJECT_ACCOUNT
Warning
Please avoid archiving more than 5 TB to 10 TB with one call of slk archive. Archiving larger amounts of data at once might cause slk archive to fail (details: How much data can I archive at once?).
Warning
Archiving too small or too large files causes problems. The optimal file size is between 10 GB and a few hundred GB. Please do not archive many small files; pack them into tar balls or similar instead. You can use packems for this purpose. It is more efficient and faster to retrieve one file of 1 GB size than ten files of 1 MB size. The read rate reaches up to 300 MB/s when data are continuously streamed. The long waiting time for tape retrievals is mainly caused by (a) the time the robot arm needs to transport a tape to the tape drive and (b) the waiting time for empty tape drives. Additionally, many single reading operations put more stress on the tapes, which increases the probability of tape failures. However, when a file of 1 TB or more is archived, its retrieval will take very long. This is disadvantageous if one only needs a small part of such a large file.
Warning
If the transfer of a file via slk archive is interrupted, an incomplete version of this file will remain in the HSM. This incomplete file is listed by slk list, has a size of 0 bytes and can be retrieved. We strongly recommend running the same call of slk archive a second time; this will archive all missing and incomplete files again. In contrast to what we communicated in the past, it is not sufficient to check for the existence of a checksum in StrongLink. Please see What to do when slk archive was interrupted/killed? and Check the integrity / completeness of archived files for details.
Useful information on slk archive#
High memory usage: Please allocate 6 GB of memory for each call of slk archive (argument for sbatch: --mem=6GB). Otherwise, your commands might be killed by the operating system. If you plan to run three archivals in parallel, please allocate 18 GB, and so on.
Check exit codes: Information on the success or failure of slk archive will not be printed into the SLURM log automatically. We strongly suggest checking the exit code of each slk archive call and printing it to the job log. The variable $? holds the exit code of the preceding command (see How do I capture exit codes? and our example scripts for usage examples).
Group of archived files: slk archive always sets the group of an archived file to the default group of the archiving user. The group is not set to the group of the target namespace, i.e. the group of the project into whose namespace the file is archived. The group has to be adapted manually.
Resume an archival: An exactly identical call of slk archive might be run twice if it did not finish properly the first time. Files which were fully transferred the first time will be skipped when slk archive is run a second time. Partly/incompletely archived files will be copied again. If a file was modified between the two calls, it will be copied again, too. slk archive compares file size and time stamp.
Amount of data archived at once: We suggest archiving not more than 5 TB to 10 TB with one call of slk archive (details: How much data can I archive at once?).
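The exit-code check mentioned above can be wrapped in a small helper so that every archival in a batch script is reported consistently. The following is a sketch only; archive_and_check is our own name, not part of slk, and the commented usage line shows how an slk archive call would be passed to it.

```shell
# Sketch: run a command, capture its exit code via $?, and report the result.
# archive_and_check is a hypothetical helper name, not part of slk.
archive_and_check() {
    "$@"
    local rc=$?
    if [ $rc -ne 0 ]; then
        >&2 echo "command failed with exit code $rc: $*"
    else
        echo "command succeeded: $*"
    fi
    return $rc
}

# usage inside a batch job (not executed here):
# archive_and_check slk archive ${src_folder}/file01.nc ${target_folder}
```

The wrapper preserves the exit code, so the surrounding script can still react to failures (e.g. exit early or retry).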
What to do when slk archive was interrupted/killed?#
Warning
In the past, we proposed that checking whether a checksum exists in StrongLink was sufficient to determine whether a file had been completely archived or not. This is not the case under some conditions. Please see Check the integrity / completeness of archived files for details.
If the archival of several files was interrupted, please run the same call of slk archive a second time. slk archive will then only transfer those files which
have not already been archived,
have only been partly archived (internally flagged as partial file) or
have been modified since the first archival.
Files might be falsely flagged as partial file under certain conditions described further below in this section.
A call of slk archive might be interrupted for these and other reasons:
manually killed by the user (e.g. via CTRL + C),
broken ssh connection,
timeout of a SLURM job,
archival of a large amount of data (> 10 TB) while the StrongLink system is under high load or
by the operating system (e.g. allowed memory exceeded).
The files which were being transferred when slk archive was interrupted are internally flagged as partial file but will remain in StrongLink until the user takes further action. Please be aware that these incomplete files are listed by slk list and may even have a checksum. slk list should print partial file for each file with such a flag. However, under certain conditions (a bug) partial file is not displayed. Therefore, please do not trust the output of slk list in this context; use slk_helpers has_no_flag_partial <TARGET_PATH> -R -v to check whether files are flagged as partial.
After a file has been transferred completely, slk archive does a quick check of the file. If slk archive is interrupted after the completion of the file transfer but before this quick check was done, the file is flagged as partial file although it has been completely archived. Repeated slk archive calls will skip this file and will not remove the flag. This is another bug which is not fixed yet. If files are skipped in repeated slk archive calls, please run slk archive -vv ... and check whether your skipped files are flagged as partial file (slk_helpers has_no_flag_partial <TARGET_PATH> -R -v). If this is the case, please notify us via support@dkrz.de so that the flag can be removed. Please also tell us if files are repeatedly listed as failed by slk archive -vv and not as skipped.
How much data can I archive at once?#
We suggest archiving not more than 5 TB to 10 TB with one call of slk archive. If you archive more than that and the StrongLink system is under high load, the transfer might be interrupted unexpectedly. The slk log (~/.slk/slk-cli.log) will show this error (look for unexpected end of stream on https://archive.dkrz.de/...):
2022-11-24 11:16:22 INFO Executing command: "archive -R /work/ab1234/c567890/much_data /arch/zy0987/c567890/target
2022-11-24 11:18:25 ERROR Unexpected exception
java.io.IOException: unexpected end of stream on https://archive.dkrz.de/...
at
okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:202)
~[slk-cli-tools-3.3.21.jar:?]
[...]
[...]
[...]
... 16 more
2022-11-24 11:18:25 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 0/85083 files [0B/20.3T]
If you want to archive more than 10 TB at once (e.g. 100 TB), please be prepared to run slk archive repeatedly. In the end, a summary similar to this one should be printed to the log:
2022-11-25 11:21:10 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 0/85083 files [0B/20.3T]
Total files skipped: 85083/85083 files [20.3T/20.3T]
Unchanged files: 85083
or this one:
2022-11-25 09:46:55 INFO
Archive report
===============
Status: incomplete
Total files uploaded: 4342/85083 files [1.2B/20.3T]
Total files skipped: 80741/85083 files [20.3T/20.3T]
Unchanged files: 80741
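One way to stay below the suggested 5 TB to 10 TB per call is to split a long file list into batches and run one slk archive call per batch. The bash sketch below only prints the batches; the actual slk archive line is left commented. archive_in_batches is our own helper name, and it assumes slk archive accepts several source paths in one call; if it does not, loop over the files of each batch individually.

```shell
# Sketch: split a list of files into fixed-size batches so that each
# "slk archive" call moves a bounded amount of data. Choose the batch size
# so that one batch stays well below 10 TB for your typical file sizes.
archive_in_batches() {
    local batch_size=$1 target=$2
    shift 2
    local files=("$@")
    local i
    for ((i = 0; i < ${#files[@]}; i += batch_size)); do
        echo "batch: ${files[@]:i:batch_size} -> ${target}"
        # slk archive ${files[@]:i:batch_size} ${target}
    done
}

# dry-run usage:
# archive_in_batches 2 /arch/xz1234/${USER}/target file01.nc file02.nc file03.nc
```

Remember to check the exit code of each slk archive call as described above if you uncomment the archival line.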
Validate archivals#
For the reasons given above, we ask you to check whether slk archive finished correctly when doing archivals.
If slk archive did not finish correctly, we strongly recommend re-running the same slk archive command or checking all archived files for completeness. Files might be archived incompletely if slk archive was killed manually, by a timeout of a SLURM job or by a disconnected ssh session. Currently, such files are also displayed by slk list and, rarely, checksums might be calculated for them. However, since the StrongLink update in October 2022, incomplete files are displayed as files with 0 bytes size. If the archived files are very important, we recommend performing a validation / integrity check as described further below.
If slk archive finished properly (exit code 0), it can be assumed that the file(s) was/were archived completely and correctly. This assumption is supported by the results of extensive archival-retrieval tests. However, bit flips and similar events might still corrupt files. Therefore, if the data you archive are very important, please compare the checksums as described below.
Check that slk archive terminated properly#
You can check whether slk archive finished without an error as follows:
slk archive /path/to/file/to/be/archived /arch/proj/user/test
if [ $? -ne 0 ]; then
>&2 echo "an error occurred in slk archive call"
else
echo "archival successful"
fi
If slk archive was killed due to a timeout of a SLURM job, an unexpected error, manual user interaction or similar, the archival can be resumed by calling the same slk archive command a second time (see What to do when slk archive was interrupted/killed?). Completely archived files will not be archived again and incomplete files will be overwritten.
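The "run the identical call a second time" advice can be automated with a small retry helper. This is a sketch under our own naming (retry_twice is not an slk command); the rerun is cheap because slk archive itself skips files that were already transferred completely.

```shell
# Sketch: resume an interrupted archival by re-running the identical call once.
# retry_twice is a hypothetical helper name, not part of slk.
retry_twice() {
    if "$@"; then
        echo "succeeded on first attempt"
        return 0
    fi
    >&2 echo "first attempt failed; re-running the identical call"
    if "$@"; then
        echo "succeeded on second attempt"
        return 0
    fi
    >&2 echo "second attempt failed as well; please check ~/.slk/slk-cli.log"
    return 1
}

# usage in a batch job:
# retry_twice slk archive -R ${src_folder} ${target_folder}
```

If the second attempt also fails, investigate the slk log before retrying further instead of looping indefinitely.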
Check the integrity / completeness of archived files#
Note
In the past, we proposed that checking whether a checksum exists in StrongLink was sufficient to determine whether a file had been completely archived or not. This is not the case under rare conditions.
There are four ways to check whether a file has been archived completely (and correctly):
Check whether a file is flagged as partial file with the command slk_helpers has_no_flag_partial -v <FILE_PATH> (validation: check if file flagged as “partial”).
Run slk archive again. If the file is skipped, it should be correct. Please be aware that bit flips and other bit-wise corruptions are not captured by this.
Compare the checksum of the original file with the checksum that StrongLink calculated for the archived file (validation: comparing checksums).
Retrieve the archived file and compare it against the original file (validation: retrieving the file). Please avoid this currently (01/2023).
Note
Please be aware that a file might be falsely flagged as partial file although it has been completely archived, as described in What to do when slk archive was interrupted/killed?. If a file is listed as partial file by slk_helpers has_no_flag_partial, please run slk archive again for this file or for all files. If partial files are skipped by slk archive, they are falsely flagged. Please contact support@dkrz.de in this case.
validation: check if file flagged as “partial”#
Incompletely archived files are flagged as partial file. Please use slk_helpers has_no_flag_partial -v to check whether one or multiple files are flagged as partial. slk list should indicate which files are flagged as such but fails to do so in some situations.
$ slk list /dkrz_test/netcdf/example
-rwxr-xr-x- k204221 bm0146 553.9M 19 Jul 2021 02:18 file_500mb_d.nc
-rw-r--r--- k204221 bm0146 553.9M 19 Jul 2021 02:18 file_500mb_e.nc
-rw-r--r--- k204221 bm0146 553.9M 19 Jul 2021 02:18 file_500mb_f.nc (Partial File)
-rw-r--r--- k204221 bm0146 554.0M 19 Jul 2021 02:18 file_500mb_g.nc (Partial File)
Files: 4
The Partial File label is not displayed if the file was moved or renamed or if the permissions, group or owner of the file were changed. This is a known slk bug. Please run slk_helpers has_no_flag_partial to check whether one or multiple files are flagged as partial:
$ slk_helpers has_no_flag_partial /dkrz_test/netcdf/20230504c -R -v
/dkrz_test/netcdf/20230504c/file_500mb_d.nc has partial flag
/dkrz_test/netcdf/20230504c/file_500mb_f.nc has partial flag
/dkrz_test/netcdf/20230504c/file_500mb_g.nc has partial flag
Number of files without partial flag: 7/10
Note
Please be aware that a file might be falsely flagged as partial file although it has been completely archived, as described in What to do when slk archive was interrupted/killed?. If a file is listed as partial file by slk_helpers has_no_flag_partial -v, please run slk archive again for this file or for all files. If partial files are skipped by slk archive, they are falsely flagged. Please contact support@dkrz.de in this case.
validation: comparing checksums#
StrongLink calculates two types of checksums for files: sha512 and adler32. It might take a few hours after the archival until the checksums have been calculated. If no checksum is available one day after the archival finished and the file size is larger than 0 bytes, please contact support@dkrz.de.
The checksums from StrongLink are obtained via slk_helpers checksum RESOURCE. The sha512 checksum of a local file is calculated via sha512sum.
# archive a file
$ slk archive test.nc /arch/bm0146/k204221/test_data
[========================================\] 100% complete. Files archived: 1/1, [1.7K/1.7K].
# wait some hours ...
# calculate the checksum of the local file
$ sha512sum test.nc
22ef50dcbd179775b5a6e632b02d8b99ddf16609f342a66c1fae818ed42a49d5a33af3dd8e059fa7a743f5b615620f2ad87a3d01bf3e2e0cde0e8a607bc1f15d test.nc
# get the checksum of the archived file
$ slk_helpers checksum -t sha512 /arch/bm0146/k204221/test_data/test.nc
22ef50dcbd179775b5a6e632b02d8b99ddf16609f342a66c1fae818ed42a49d5a33af3dd8e059fa7a743f5b615620f2ad87a3d01bf3e2e0cde0e8a607bc1f15d
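The manual comparison above can be scripted. The helper below (compare_checksums is our own name) extracts the checksum field from the sha512sum output line and compares it with the string returned by slk_helpers checksum:

```shell
# Sketch: compare a local sha512 checksum with the one reported by StrongLink.
# Takes the raw "sha512sum" output line ("<checksum>  <filename>") and the
# checksum string printed by "slk_helpers checksum -t sha512".
compare_checksums() {
    local local_line=$1 remote_sum=$2
    local local_sum
    local_sum=$(echo "$local_line" | awk '{ print $1 }')
    if [ "$local_sum" = "$remote_sum" ]; then
        echo "checksums match"
        return 0
    fi
    >&2 echo "checksum mismatch: local=$local_sum remote=$remote_sum"
    return 1
}

# usage:
# compare_checksums "$(sha512sum test.nc)" \
#     "$(slk_helpers checksum -t sha512 /arch/bm0146/k204221/test_data/test.nc)"
```

The non-zero exit code on mismatch lets a batch script record the result, as the templates at the end of this page do.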
validation: retrieving the file#
Please avoid doing this until mid-2023, when more tape drives will become available.
When we want to be as sure as possible that a file is correctly archived on tape, we need to wait until the file has been written to tape and then retrieve it from there. After a file has been written to tape, it remains in the HSM cache for a few hours or days. As long as the file is in the HSM cache, it will be retrieved from there instead of from tape. Therefore, we need to wait until the file's copy in the HSM cache has been deleted. Then we can run slk retrieve and get the file from tape.
Note
Files of a size of a few MB and below are not deleted from the HSM cache and will remain there forever.
# archive a file
$ slk archive test.nc /arch/bm0146/k204221/test_data
[========================================\] 100% complete. Files archived: 1/1, [1.7K/1.7K].
# wait some days ...
# check if file is still in the HSM cache
$ slk_helpers iscached /arch/bm0146/k204221/test_data/test.nc
File is cached
# still cached; wait more time
# wait some days ...
# check again whether file is in the cache
$ slk_helpers iscached /arch/bm0146/k204221/test_data/test.nc
File is not cached
# now, we retrieve the archived file
$ slk retrieve /arch/bm0146/k204221/test_data/test.nc compare/
...
# compare the two files; e.g. via their checksums and also get the checksum from StrongLink
$ sha512sum test.nc
22ef50dcbd179775b5a6e632b02d8b99ddf16609f342a66c1fae818ed42a49d5a33af3dd8e059fa7a743f5b615620f2ad87a3d01bf3e2e0cde0e8a607bc1f15d test.nc
$ sha512sum compare/test.nc
22ef50dcbd179775b5a6e632b02d8b99ddf16609f342a66c1fae818ed42a49d5a33af3dd8e059fa7a743f5b615620f2ad87a3d01bf3e2e0cde0e8a607bc1f15d compare/test.nc
$ slk_helpers checksum -t sha512 /arch/bm0146/k204221/test_data/test.nc
22ef50dcbd179775b5a6e632b02d8b99ddf16609f342a66c1fae818ed42a49d5a33af3dd8e059fa7a743f5b615620f2ad87a3d01bf3e2e0cde0e8a607bc1f15d
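The manual "wait some days" steps above can be automated with a polling loop. The generic helper below re-runs a command until it fails; assuming slk_helpers iscached exits with a non-zero code once the file is no longer cached, it waits until the file has left the HSM cache. wait_until_fails, the check interval and the retry limit are our own choices, not part of slk_helpers.

```shell
# Sketch: repeatedly run a command until it exits non-zero, sleeping
# between checks. Returns 0 once the command fails (e.g. the file is no
# longer cached), 1 if it still succeeds after all checks.
wait_until_fails() {
    local max_checks=$1 interval=$2
    shift 2
    local i=1
    while [ $i -le $max_checks ]; do
        if ! "$@"; then
            return 0   # command failed, e.g. file no longer in the HSM cache
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    return 1   # command still succeeding after all checks
}

# usage: check every 6 hours, at most 20 times, then retrieve from tape
# wait_until_fails 20 21600 slk_helpers iscached /arch/bm0146/k204221/test_data/test.nc \
#     && slk retrieve /arch/bm0146/k204221/test_data/test.nc compare/
```

Inside a SLURM job, prefer resubmitting a delayed job (as in the checksum template below) over sleeping for days, so the job does not occupy the partition while waiting.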
Archival wrapper for SLURM#
We will soon provide slk wrapper scripts to simplify the submission of archival jobs on Levante.
Archival script templates#
Several script templates for different use cases are printed below and available for download:
several archivals of single files: archive_slurm_template_single_files.sh
archival of one file and checksum check: archive_slurm_template_single_file_with_checksum_check.sh and archive_slurm_template_get_and_compare_checksum.sh
When you use these templates, you need to make a few adaptations (not every script needs all of them):
modify src_folder: replace /work/xz1234/ex/am/ple by the actual source folder on the Lustre file system
modify target_folder: replace /arch/xz1234/${USER}/ex/am/ple by something appropriate for your project
replace xz1234 in --account=xz1234 by your project account name (in all relevant scripts)
modify src_file: replace file.nc by a correct filename
modify rtrv_folder: replace /scratch/${USER:0:1}/${USER}/validation by a temporary target folder for your validation retrievals
Please run/submit these scripts via sbatch as described in Run slk as batch job and SLURM Introduction.
several archivals of single files#
#!/bin/bash
# HINT:
# * You can change the values right of the "=" as you wish.
# * The "%j" in the log file names means that the job id will be inserted
#SBATCH --job-name=test_slk_arch_job # Specify job name
#SBATCH --output=test_job.o%j # name for standard output log file
#SBATCH --error=test_job.e%j # name for standard error output log
#SBATCH --partition=shared # partition name
#SBATCH --ntasks=1 # max. number of tasks to be invoked
#SBATCH --time=08:00:00 # Set a limit on the total run time
#SBATCH --account=xz1234 # Charge resources on this project
#SBATCH --mem=6GB
# make 'module' available when script is submitted from certain environments
source /sw/etc/profile.levante
# ~~~~~~~~~~~~ preparation ~~~~~~~~~~~~
module load slk
# set the source folder
src_folder=/work/xz1234/ex/am/ple
# set target folder for archival
target_folder=/arch/xz1234/${USER}/ex/am/ple
# ~~~~~~~~~~~~ archivals ~~~~~~~~~~~~
# do the archival
echo "doing 'slk archive'"
# ~~~~~~~~~~~~ doing single-file archivals ~~~~~~~~~~~~
# You can do multiple archivals in one script. The exit code of each
# archival should be captured afterwards (get $? in line after slk command)
slk archive ${src_folder}/file01.nc ${target_folder}
if [ $? -ne 0 ]; then
>&2 echo "an error occurred in slk archive call 1"
else
echo "archival 1 successful"
fi
# second archival and capture exit code (get $? in line after slk cmd)
slk archive ${src_folder}/file02.nc ${target_folder}
if [ $? -ne 0 ]; then
>&2 echo "an error occurred in slk archive call 2"
else
echo "archival 2 successful"
fi
# ...
# ...
# fifteenth archival and capture exit code (get $? in line after slk cmd)
slk archive ${src_folder}/file15.nc ${target_folder}
if [ $? -ne 0 ]; then
>&2 echo "an error occurred in slk archive call 15"
else
echo "archival 15 successful"
fi
archival of one file with delayed checksum check#
This template/example consists of two files:
archival (also starts the second script):
archive_slurm_template_single_file_with_checksum_check.sh
get and compare checksums:
archive_slurm_template_get_and_compare_checksum.sh
archive_slurm_template_single_file_with_checksum_check.sh#
#!/bin/bash
# HINT:
# * You can change the values right of the "=" as you wish.
# * The "%j" in the log file names means that the job id will be inserted
#SBATCH --job-name=test_slk_arch_job # Specify job name
#SBATCH --output=test_job.o%j # name for standard output log file
#SBATCH --error=test_job.e%j # name for standard error output log
#SBATCH --partition=shared # partition name
#SBATCH --ntasks=1 # max. number of tasks to be invoked
#SBATCH --time=08:00:00 # Set a limit on the total run time
#SBATCH --account=xz1234 # Charge resources on this project
#SBATCH --mem=6GB
# make 'module' available when script is submitted from certain environments
source /sw/etc/profile.levante
# ~~~~~~~~~~~~ preparation ~~~~~~~~~~~~
module load slk
# set the source folder
src_folder=/work/xz1234/ex/am/ple
src_file=file.nc
# set target folder for archival
target_folder=/arch/xz1234/${USER}/ex/am/ple
# set a file to write the result of the checksum comparison into
checksum_result_file=${src_folder}/${src_file}.chk
# ~~~~~~~~~~~~ archivals ~~~~~~~~~~~~
# do the archival
echo "doing 'slk archive'"
# We run the archival and capture the exit code ...
slk archive ${src_folder}/${src_file} ${target_folder}
if [ $? -ne 0 ]; then
>&2 echo "an error occurred in slk archive call"
exit 1
else
echo "archival successful"
fi
# ... then we calculate the checksum and ...
checksum_src_file_raw=`sha512sum ${src_folder}/${src_file}`
if [ $? -ne 0 ]; then
>&2 echo "checksum could not be calculated"
exit 1
else
echo "calculation of checksum successful: ${checksum_src_file_raw}"
fi
echo $checksum_src_file_raw > ${src_folder}/${src_file}.sha512
# ... submit a delayed job for retrieving the checksum from StrongLink
sbatch --begin="now+2hours" ./archive_slurm_template_get_and_compare_checksum.sh ${src_folder}/${src_file}.sha512 ${target_folder}/${src_file} ${checksum_result_file}
archive_slurm_template_get_and_compare_checksum.sh#
#!/bin/bash
# HINT:
# * You can change the values right of the "=" as you wish.
# * The "%j" in the log file names means that the job id will be inserted
#SBATCH --job-name=test_slk_checksum # Specify job name
#SBATCH --output=test_job.o%j # name for standard output log file
#SBATCH --error=test_job.e%j # name for standard error output log
#SBATCH --partition=shared # partition name
#SBATCH --ntasks=1 # max. number of tasks to be invoked
#SBATCH --time=08:00:00 # Set a limit on the total run time
#SBATCH --account=xz1234 # Charge resources on this project
#SBATCH --mem=6GB
# make 'module' available when script is submitted from certain environments
source /sw/etc/profile.levante
# ~~~~~~~~~~~~ get and print arguments ~~~~~~~~~~~~
if [ "$#" -ne 3 ]; then
echo -1
>&2 echo "need three input arguments (got $#): FILE_CONTAINING_CHECKSUM_OF_SRC_FILE RESOURCE_PATH_HSM CHECKSUM_COMPARISON_RESULT_FILE"
exit 1
fi
checksum_file=$1
resource_path_hsm=$2
checksum_result_file=$3
echo "~~~ got this input: ~~~"
echo "checksum_file: ${checksum_file}"
echo "resource_path_hsm: ${resource_path_hsm}"
echo "checksum_result_file: ${checksum_result_file}"
# ~~~~~~~~~~~~ preparation ~~~~~~~~~~~~
module load slk
# ~~~~~~~~~~~~ get source file's checksum ~~~~~~~~~~~~
if [ ! -f ${checksum_file} ]; then
>&2 echo "file containing the checksum of the source file does not exist: '${checksum_file}'"
exit 1
fi
checksum_src_file_raw=`cat ${checksum_file}`
checksum_src_file=`echo ${checksum_src_file_raw} | awk '{ print $1 }'`
# ~~~~~~~~~~~~ check if HSM file is available ~~~~~~~~~~~~
# first we check whether the resource/file actually exists in the HSM
echo "doing 'slk_helpers exists'"
slk_helpers exists ${resource_path_hsm}
exit_code=$?
if [ $exit_code -ne 0 ]; then
if [ $exit_code -eq 1 ]; then
>&2 echo "file '${resource_path_hsm}' does not exist; stop obtaining a checksum"
exit 1
else
>&2 echo "an unknown error occurred in 'slk_helpers exists ${resource_path_hsm}' call; exit code: ${exit_code}"
exit 1
fi
else
echo "file exists in HSM ('$resource_path_hsm')"
fi
# ~~~~~~~~~~~~ get HSM checksum ~~~~~~~~~~~~
echo "doing 'slk_helpers checksum -t sha512'"
# We query the checksum from StrongLink and capture the exit code ...
checksum_hsm_file_raw=`slk_helpers checksum -t sha512 ${resource_path_hsm}`
exit_code=$?
if [ $exit_code -ne 0 ]; then
if [ $exit_code -eq 1 ]; then
echo "checksum of '${resource_path_hsm}' not yet calculated by StrongLink; resubmitting this job"
sbatch --begin="now+2hours" ${0} ${checksum_file} ${resource_path_hsm} ${checksum_result_file}
exit 0
else
>&2 echo "an error occurred in slk_helpers checksum call; exit code: ${exit_code}"
exit 1
fi
else
echo "getting checksum successful"
fi
checksum_hsm_file=`echo ${checksum_hsm_file_raw} | awk '{ print $1 }'`
# ~~~~~~~~~~~~ compare if checksums are equal ~~~~~~~~~~~~
echo "Result of checksum comparison will be written into ${checksum_result_file} (first line: 0 == checksums equal; 1 == checksums differ)"
if [ "${checksum_src_file}" = "${checksum_hsm_file}" ]; then
echo "checksums are equal: ${checksum_src_file}"
exit_code=0
else
echo "checksums are unequal: ${checksum_src_file} and ${checksum_hsm_file}"
exit_code=1
fi
echo "${exit_code}" > ${checksum_result_file}
echo "# 0 == checksums equal; 1 == checksums differ" >> ${checksum_result_file}
echo "checksum src file: ${checksum_src_file_raw}" >> ${checksum_result_file}
echo "checksum HSM file: ${checksum_hsm_file} ${resource_path_hsm}" >> ${checksum_result_file}
exit ${exit_code}