Getting started#
file version: 16 Jun 2026
Warning
The Versity HSM system is not active yet. The StrongLink system will be replaced by the Versity software at the end of June 2026. We expect Versity to be online on 6 July 2026.
Overview#
The Versity HSM System at DKRZ offers four tools/interfaces to interact with the system:
read-only mounts /arch, /double and /doku on Levante login and interactive nodes (not on compute nodes)
ATLAS WebGUI
ATLAS command line interface (acli)
Versity S3 Gateway (not available the first days after migration)
ATLAS is meant as the central tool for data transfer between tape archive and Lustre filesystem. Using ATLAS requires a login with your default DKRZ credentials. Keys and secrets for the Versity S3 Gateway will be made available via LUV as soon as the Versity S3 Gateway is available to you.
Beyond providing new tools and interfaces, key features of the new HSM software by Versity are:
automatic packing of small files: the process is not visible to users
background data transfer via ATLAS: no need for users to run long SLURM jobs for archivals or retrievals
ACLs are supported: as file owner, you can set read/write/execute permissions for individual users
very efficient internal organization of the tape access: no need to sort retrieval requests by tape
If you are looking for a quick guide for the migration from slk
and/or slk_helpers to new commands of the new software, please have
a look here.
Transfer: When to use which tool?#
Levante/Lustre <-> Tape: ATLAS#
The default tool for transferring data between the Lustre filesystem and
the DKRZ tape archive is ATLAS. The command line interface acli and the
ATLAS WebGUI offer nearly the same set of features. acli can be used
interactively and in scripts. You need to log in with your default
DKRZ user credentials.
S3 <-> Tape: Versity S3 Gateway#
When you want to transfer data between the DKRZ S3 system and the DKRZ
tape archive, we recommend using the Versity S3 endpoint with a tool
that allows S3-to-S3 transfers like rclone. You need to generate an
S3 key and secret via LUV.
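A minimal sketch of such an S3-to-S3 transfer with rclone follows. The remote names dkrz-s3 and versity-s3 as well as the bucket and prefix are placeholders, not names defined by DKRZ; both remotes would have to be set up in your own rclone configuration with the keys and secrets from LUV.

```shell
# Sketch only: copy one prefix from a source S3 remote to a destination S3 remote.
# Both remote names are hypothetical and must exist in your rclone configuration.
s3_to_tape() (
    src="$1"   # e.g. dkrz-s3:my-bucket/output_42      (placeholder)
    dst="$2"   # e.g. versity-s3:my-bucket/output_42   (placeholder)
    # "rclone copy" transfers new or changed objects and does not delete
    # anything at the destination.
    rclone copy --progress "$src" "$dst"
)
```

Adding rclone's --dry-run option to the call lists what would be transferred without copying anything, which is a cheap check before a large transfer.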
Look into one file from tape: Mount Point#
When you want to peek into a single file or process it, you can use the
read-only mounts /arch, /double and /doku to access it.
Please stage (“recall” in StrongLink) the file in advance via
acli stage <file_path>.
Outside world <-> Tape: depends#
When you want to transfer data between the outside world and the DKRZ tape archive, please use the DKRZ S3 storage as an intermediate step. If you need to transfer multiple TB between DKRZ and another site once, we are able to set up an S3 endpoint which is open to the world. For the reasons stated above, this will only be done in very few cases for the time being. Additionally, we will elaborate on the usage of Globus Transfer in combination with the Versity system once we have migrated successfully and resolved major teething problems.
File transfer via ATLAS#
ATLAS offers three modes for file transfer:
cp (recommended; “archival” or “retrieval”):
transfer a file within the HSM system or between HSM system and the Lustre filesystem (/scratch: both directions; /work: read-only)
ATLAS handles the transfer for you
acli/WebGUI can be exited after transfer was started
check jobs in ATLAS to get the status of running file transfers
the cp command requires absolute paths
upload (“archival”):
transfer a file which is available on your “local” filesystem to the HSM system
acli/WebGUI has to run while the transfer is ongoing
download (“retrieval”):
transfer a file from the HSM system to your “local” filesystem
acli/WebGUI has to run while the transfer is ongoing
The transfer via acli cp / WebGUI’s copy is the recommended transfer
mode. It is much more efficient than upload and download, which
should only be used for very specific use cases.
Examples#
Copy a folder recursively from /work into the archive.
$ acli cp --recursive /work/ka1209/k204221/output_42 /arch/ka1209/k204221/
Create a destination folder and copy two files from /scratch into the archive.
$ acli mkdir /arch/ka1209/single_files
$ acli cp /scratch/k/k204221/file_01.txt /scratch/k/k204221/file_02.txt /arch/ka1209/single_files/
Retrieve files from the archive to /scratch.
$ acli cp --recursive /arch/ka1209/k204221/output_42 /scratch/k/k204221
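Since the cp command requires absolute paths, a script can check its arguments before handing them to acli. The following is a minimal sketch; the helper names check_absolute and archive_dir are made up for this illustration and are not part of acli.

```shell
# Sketch: succeed only if every argument is an absolute path.
check_absolute() (
    for p in "$@"; do
        case "$p" in
            /*) ;;                                      # starts with "/": absolute
            *)  echo "not an absolute path: $p" >&2     # relative: report and stop
                exit 1 ;;
        esac
    done
)

# Copy a folder recursively, but only after both paths passed the check.
archive_dir() (
    check_absolute "$1" "$2" && acli cp --recursive "$1" "$2"
)
```

A call such as archive_dir output_42 /arch/ka1209/k204221/ then stops with an error message instead of starting a transfer with a relative source path.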
Cache, Online, Offline, Stage, Release!?#
Storage of data on tape is cheap compared to storage on HDD, SSD or similar. Therefore, the DKRZ data archive uses tapes for data storage. However, accessing data from tape is relatively slow compared to other storage types.
If you request a file from tape, it is copied to faster storage within the HSM system, denoted as cache, and then transferred to you. Files remain in the cache for a limited time before they are removed from it. As long as a file is cached, you can access it very quickly. If it is not cached, you have to wait while it is read from tape (10 to 15 GB/minute plus overhead for spooling and transport to the tape drive).
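To get a feeling for the waiting time, the quoted range of 10 to 15 GB/minute can be turned into a rough estimate. The sketch below only does this arithmetic; the function name is made up, and the overhead for spooling and tape transport is not included because no figure is given for it.

```shell
# Rough lower and upper bound in minutes for reading SIZE_GB from tape,
# based only on the 10 to 15 GB/minute range; overhead is NOT included.
estimate_stage_minutes() (
    size_gb="$1"
    fast=$(( ($size_gb + 14) / 15 ))   # best case: 15 GB/minute, rounded up
    slow=$(( ($size_gb + 9) / 10 ))    # worst case: 10 GB/minute, rounded up
    echo "${fast}-${slow} minutes"
)

estimate_stage_minutes 300   # prints "20-30 minutes"
```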
If you archive a file, it is written into the cache and, later, from there to tape. “Later” in this context can be minutes or hours depending on the load on the system and the availability of tape drives. This process cannot be controlled by you. A file cannot be removed from the cache if it has not been written to tape.
In Versity, a file is denoted as online when it is cached. Otherwise it is offline. The process of copying files into the cache is denoted as staging. You can manually tell Versity to remove a file from the cache by releasing it. Usually you do not need to do this because the automatic cache cleanup works well.
If you copy files via ATLAS to the Lustre filesystem, they are automatically
staged. However, if you access files via one of the mount points such as
/arch you have to stage them manually by
acli stage <FILE>
The cleanup of the cache starts when the fill state reaches an upper threshold. Then the oldest files, judged by last
access time and last staging time, are removed first until the fill state reaches a lower threshold. Thus, the acli stage command
also acts like a touch. There is no guarantee that a file remains cached for X hours after running acli stage. Please do not run the command continuously in a loop for all of your files. If all users were doing this …
We recommend running the staging command every time before you access files via one of the mount points. Running the staging command resets the timestamp that is relevant for the cache cleanup routines and thus increases the lifetime of the targeted file in the cache.
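The cleanup rule can be illustrated with a toy model. This is a sketch of the behaviour described above, not Versity's implementation: the capacity and both thresholds are invented numbers, and each file carries a single timestamp standing in for the later of its last access and last staging time.

```shell
# Toy model of the cache cleanup, NOT the real implementation.
# stdin : one line per cached file: "<last_touch_epoch> <size_gb> <name>"
# args  : cache capacity in GB, upper and lower threshold in percent (all hypothetical)
# stdout: names of the files that would be removed, oldest first
simulate_cleanup() (
    capacity="$1"; upper="$2"; lower="$3"
    sort -n | awk -v cap="$capacity" -v up="$upper" -v low="$lower" '
        { size[NR] = $2; name[NR] = $3; used += $2 }
        END {
            if (used * 100 < up * cap) exit     # below upper threshold: no cleanup
            for (i = 1; i <= NR && used * 100 > low * cap; i++) {
                print name[i]                   # oldest remaining file goes first
                used -= size[i]
            }
        }'
)

printf '%s\n' '300 25 c.nc' '100 40 a.nc' '200 30 b.nc' | simulate_cleanup 100 90 50
```

With these numbers the cache is 95 % full, so a.nc and b.nc are removed and c.nc stays. Had a.nc been staged again shortly before the cleanup, it would carry the newest timestamp and b.nc and c.nc would be removed instead, which is the touch-like effect of acli stage.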
File access via /arch mount point#
There are three mounts, /arch, /double and /doku, available on all Levante login nodes. In most cases, you will be interested in /arch (see Storage options, quota and file size for details). All mounts are read-only. This means that you can retrieve/access archived data but cannot archive new data via the mounts. The mounts are not intended for major data transfer but rather enable you to have a quick look into a few files without the need to retrieve them to /work or somewhere else.
If you want to access a file via the /arch mount point, the file has to be online. Accessing an offline file will result in an error. Therefore, the file has to be staged in advance by
acli stage <FILE>
We are aware that this is not the most convenient approach. However, this is the approach which Versity support recommends based on experience at other customers’ sites.
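The two steps, staging and then reading via the mount, can be wrapped in a small helper. The sketch below assumes that acli stage returns only once the file is online, which is not stated here; if it returns earlier, the read may still fail and has to be repeated later. The function name peek_archived is made up for this illustration.

```shell
# Sketch: stage a file, then show its first lines through the read-only mount.
# Assumption: "acli stage" has finished bringing the file online when it returns.
peek_archived() (
    file="$1"          # absolute path below /arch, /double or /doku
    lines="${2:-10}"   # number of lines to show (default: 10)
    acli stage "$file" || { echo "staging failed: $file" >&2; exit 1; }
    head -n "$lines" "$file"
)
```

Called as peek_archived /arch/ka1209/test.txt 20, it would stage the file and print its first 20 lines.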
File transfer via S3#
If you need to transfer data between the Lustre filesystem and the tape archive, please use ATLAS. However, if data should be transferred between an S3 bucket and the tape archive, the Versity S3 Gateway is best suited.
The archival of data can be done with an S3 client of your choice. Retrieval of files requires an S3 tool which supports the Amazon Glacier command set. Details will be published later on.
File transfer from/to outside DKRZ#
Please contact us via support@dkrz.de
Checksum verification#
Versity verifies a checksum of each file after each copy process within the HSM system – e.g. after staging a file from tape into the cache. These checksums are not available to the user. The acli cp command does not verify checksums.
One checksum which is visible to users can be stored per file. Versity does not calculate it automatically and does not prescribe a checksum type. For example, if your model output workflow includes verification of output data via md5 checksums, then you can ask Versity to generate and/or verify md5 checksums:
$ acli checksum -g -t md5 /arch/ka1209/test.txt
/arch/ka1209/test.txt: MD5:1efdc3f67c25c52e10551c7e655013ef
Checksums can only be calculated for files which are online. Offline files need to be staged manually before checksums can be calculated.
An alternative verification workflow is to prescribe the checksum and ask Versity to verify it.
# calculate checksum locally
$ md5sum /work/ka1209/tast.txt
78b7249583a98ddc2ac8425f9b4a9498 /work/ka1209/tast.txt
# set and verify checksum
$ acli checksum --set 78b7249583a98ddc2ac8425f9b4a9498 --type md5 --verify /arch/ka1209/tast.txt
/arch/ka1209/tast.txt: md5 checksum set
$ echo $?
0
# set wrong checksum and verify
$ acli checksum --set 78b7249583a98ddc2ac8425f9b4a9498 --type md5 --verify /arch/ka1209/test.txt
/arch/ka1209/test.txt: set checksum: set checksum: scoutam: set checksum inode 4895745: rpc error: code = Unknown desc = verify user checksum atlas/arch/bm0146/acli_3.4.5-20260615060213-8985bbe8_linux_amd64.rpm: checksum mismatch: 78b7249583a98ddc2ac8425f9b4a9498 != 1efdc3f67c25c52e10551c7e655013ef
Error: 1 path(s) failed
$ echo $?
1
# remove a checksum
$ acli checksum -c /arch/ka1209/tast.txt
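In a script, the local checksum calculation and the set-and-verify call can be combined. The following is a minimal sketch using only the options shown above; the function name set_verified_md5 is made up. It assumes that the archived copy already exists and is online, so after an acli cp the corresponding job has to be finished first.

```shell
# Sketch: compute the md5 checksum of the local copy, then ask Versity to
# store it for the archived copy and verify it in the same call.
set_verified_md5() (
    local_file="$1"      # e.g. the original file on /work or /scratch
    archived_file="$2"   # the archived copy below /arch (must be online)
    sum=$(md5sum "$local_file" | cut -d ' ' -f 1)
    [ -n "$sum" ] || { echo "could not checksum: $local_file" >&2; exit 1; }
    # The exit code of acli (0 on success, non-zero on mismatch, as in the
    # examples above) becomes the exit code of this function.
    acli checksum --set "$sum" --type md5 --verify "$archived_file"
)
```

Because the exit code is passed through, the helper can be used directly in a condition, for example to delete the local copy only if the verification succeeded.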
Retention#
Versity allows you to manually set a retention time per file. A file cannot be deleted within the retention time. Thus, the retention time can protect a file from being accidentally deleted. The retention time can also be removed manually.
Python and Versity#
Work in progress
Useful Scripts#
Work in progress