Data Transfer#

file version: 27 Jun 2026

Warning

The Versity HSM system is not active yet. We expect it will be online on 6 July 2026.

Overview#

The Versity HSM System at DKRZ offers four tools/interfaces to interact with the system:

  • ATLAS WebGUI or command line interface (acli)

  • read-only mounts /arch, /double and /doku on Levante login and interactive nodes (not on compute nodes)

  • Versity S3 Gateway (not available during the first days after migration)

ATLAS is meant as the central tool for data transfer between tape archive and Lustre filesystem. Using ATLAS requires a login with your default DKRZ credentials. Keys and secrets for the Versity S3 Gateway will be made available via LUV as soon as the Versity S3 Gateway is available to you.

If you are looking for a quick guide for the migration from slk and/or slk_helpers to new commands of the new software, please have a look here.

Transfer: When to use which tool?#

Levante/Lustre <-> Tape: ATLAS#

The default tool for transferring data between the Lustre filesystem and the DKRZ tape archive is ATLAS. The command line interface acli and the ATLAS WebGUI offer nearly the same set of features. acli can be used interactively as well as in scripts. You need to log in with your default DKRZ user credentials.

S3 <-> Tape: Versity S3 Gateway#

When you want to transfer data between the DKRZ S3 system and the DKRZ tape archive, we recommend using the Versity S3 endpoint with a tool that allows S3-to-S3 transfers like rclone. You need to generate an S3 key and secret via LUV.
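As a rough illustration of such an S3-to-S3 transfer in the archival direction, the following sketch wraps rclone copy in a small shell function. The function name s3_to_tape and the remote names dkrz-s3 and versity are assumptions made for this example. Both remotes would first have to be set up with rclone config, using the key and secret from LUV. The bucket layout of the Versity S3 Gateway is not described here, so the bucket names are placeholders as well.

```shell
# Sketch only: copy objects from a DKRZ S3 bucket to the tape archive through
# the Versity S3 Gateway. `s3_to_tape` is a hypothetical helper name; the
# remotes passed to it must already exist in your rclone configuration.
s3_to_tape() {
    if [ "$#" -lt 2 ]; then
        echo "usage: s3_to_tape SOURCE DESTINATION [RCLONE_FLAGS...]" >&2
        return 2
    fi
    _src=$1   # e.g. dkrz-s3:my-bucket/output_42          (hypothetical)
    _dst=$2   # e.g. versity:my-archive-bucket/output_42  (hypothetical)
    shift 2
    # Remaining arguments are forwarded to rclone as flags, e.g. --dry-run.
    rclone copy "$@" "$_src" "$_dst"
}
```

Calling s3_to_tape dkrz-s3:my-bucket/output_42 versity:my-archive-bucket/output_42 --dry-run first lets rclone report what it would transfer without copying anything. Dropping --dry-run starts the actual copy.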

Look into one file from tape: Mount Point#

When you want to peek into a single file or process it, you can use the read-only mounts /arch, /double and /doku to access it. Please stage (“recall” in StrongLink) the file in advance via acli stage <file_path>.

Outside world <-> Tape: depends#

When you want to transfer data between the outside world and the DKRZ tape archive, please use the DKRZ S3 storage as an intermediate step. If you need to transfer multiple TB between DKRZ and another site once, we are able to set up an S3 endpoint which is open to the world. For the reasons stated above, this will only be done in very few cases for the time being. Additionally, we will elaborate on the usage of Globus Transfer in combination with the Versity system once we have migrated successfully and resolved major teething problems.

File transfer via ATLAS#

ATLAS offers three modes for file transfer:

  • cp (recommended; “archival” or “retrieval”):

    • transfer a file within the HSM system or between HSM system and the Lustre filesystem (/scratch: both directions; /work: read-only)

    • ATLAS handles the transfer for you

    • acli/WebGUI can be exited after transfer was started

    • check jobs in ATLAS to get the status of running file transfers

    • the cp command requires absolute paths

  • upload (“archival”):

    • transfer a file which is available on your “local” filesystem to the HSM system

    • acli/WebGUI has to run while the transfer is ongoing

  • download (“retrieval”):

    • transfer a file from the HSM system to your “local” filesystem

    • acli/WebGUI has to run while the transfer is ongoing

The transfer via acli cp / WebGUI’s copy is the recommended transfer mode. It is much more efficient than upload and download, which should only be used for very specific use cases.

Examples#

Copy a folder recursively from /work into the archive.

$ acli cp --recursive /work/ka1209/k204221/output_42 /hsm/arch/ka1209/k204221/

Create a destination folder and copy two files from /scratch into the archive.

$ acli mkdir /hsm/arch/ka1209/single_files
$ acli cp /scratch/k/k204221/file_01.txt /scratch/k/k204221/file_02.txt /hsm/arch/ka1209/single_files/

Retrieve files from the archive to /scratch.

$ acli cp --recursive /hsm/arch/ka1209/k204221/output_42 /scratch/k/k204221
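Since the cp command requires absolute paths, a script that calls acli cp may want to catch relative paths before the transfer is submitted. The following is a minimal sketch of such a check. The function name acli_cp_checked is hypothetical and not part of acli. It assumes that every argument starting with "-" is an option (such as --recursive from the examples above) and that everything else is a path.

```shell
# Sketch only: reject relative paths before handing the arguments to `acli cp`.
# `acli_cp_checked` is a hypothetical wrapper, not an acli feature.
acli_cp_checked() {
    for _arg in "$@"; do
        case "$_arg" in
            -*) ;;   # option such as --recursive: passed through unchanged
            /*) ;;   # absolute path: accepted
            *)
                echo "error: '$_arg' is not an absolute path" >&2
                return 1
                ;;
        esac
    done
    # All path arguments are absolute: run the real command.
    acli cp "$@"
}
```

With this wrapper, acli_cp_checked --recursive output_42 /hsm/arch/ka1209/k204221/ stops with an error message because output_42 is relative, whereas the same call with /work/ka1209/k204221/output_42 is passed to acli cp unchanged.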

Cache, Online, Offline, Stage, Release!?#

Storage of data on tape is cheap compared to storage on HDD, SSD or similar. Therefore, the DKRZ data archive uses tapes for data storage. However, accessing data from tape is relatively slow compared to other storage types.

If you request a file from tape, it is copied to a faster storage within the HSM system, denoted as cache, and then transferred to you. Files remain in the cache for a limited time before they are removed from it. As long as a file is cached, you can access it very quickly. If not, you have to wait while it is read from tape (at 10 to 15 GB/minute, plus overhead for spooling and transport to the tape drive).
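To get a feeling for these numbers, the read rate of 10 to 15 GB per minute can be turned into a rough waiting-time window. The sketch below does only this arithmetic. The function name estimate_recall_minutes is made up for this example, and the overhead for spooling and tape transport is left out because no figure is given for it.

```shell
# Sketch only: pure read time for SIZE_GB gigabytes at 10 to 15 GB/minute.
# Spooling and tape transport overhead is NOT included.
estimate_recall_minutes() {
    _size_gb=$1
    # Integer ceiling division: (a + b - 1) / b
    _fastest=$(( (_size_gb + 15 - 1) / 15 ))   # at 15 GB/minute
    _slowest=$(( (_size_gb + 10 - 1) / 10 ))   # at 10 GB/minute
    echo "$_fastest $_slowest"
}
```

For a 300 GB file, estimate_recall_minutes 300 prints "20 30", i.e. 20 to 30 minutes of read time before any overhead is added.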

If you archive a file, it is written into the cache and, later, from there to tape. “Later” in this context can be minutes or hours depending on the load on the system and the availability of tape drives. This process cannot be controlled by you. A file cannot be removed from the cache if it has not been written to tape.

In Versity, a file is denoted as online when it is cached. Otherwise it is offline. The process of copying files into the cache is denoted as staging. You can manually tell Versity to remove a file from the cache by releasing it. Commonly, you do not need to do this because the cache is cleaned up automatically quite well.

If you copy files via ATLAS to the Lustre filesystem, they are automatically staged. However, if you access files via one of the mount points such as /arch you have to stage them manually by

acli stage <FILE>

The cleanup of the cache starts when the fill state reaches an upper threshold. Then, the oldest files are removed first (based on the last access time and the last staging time) until the fill state reaches a lower threshold. Thus, the acli stage command also acts like a touch. There is no guarantee that a file remains cached for X hours after running acli stage. Please do not run the command permanently for all of your files in a loop. If all users were doing this …

We recommend running the staging command every time before you access files via one of the mount points. Running the staging command resets the timestamp that is relevant for the cache cleanup routines and thus increases the lifetime of the targeted file in the cache.

File access via /hsm/arch mount point#

There are three mounts /hsm/arch, /hsm/double and /hsm/doku available on all Levante login nodes. In most cases, you will be interested in /arch (see Storage options, quota and file size for details). All mounts are read-only. This means that you can retrieve/access archived data but cannot archive new data via the mounts. The mounts are not intended for major data transfers but rather enable you to have a quick look into a few files without the need to retrieve them to /work or somewhere else.

If you want to access a file via the /hsm/arch mount point, the file has to be online. Accessing an offline file will result in an error. Therefore, the file has to be staged in advance by

acli stage <FILE>

We are aware that this is not the most convenient approach. However, this is the approach which the Versity support recommends based on experience at other customers’ sites.
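In a script, the stage-then-access pattern could look like the following sketch. It stages a file once and then polls until the first byte can be read through the mount. Several points are assumptions rather than documented behaviour: that acli stage may return before the file is online, that reading an offline file fails with a non-zero exit code, and that the same path is valid for acli and for the mount. The function name stage_and_wait, the number of attempts and the delay are arbitrary choices for this example.

```shell
# Sketch only: stage a file once, then wait until it is readable via the mount.
# `stage_and_wait` is a hypothetical helper; see the assumptions stated above.
stage_and_wait() {
    _file=$1
    _tries=${2:-30}   # maximum number of read attempts (arbitrary default)
    _delay=${3:-60}   # seconds to wait after a failed attempt (arbitrary default)
    # Stage exactly once; do not repeat this call inside the loop.
    acli stage "$_file" || return 1
    _i=0
    while [ "$_i" -lt "$_tries" ]; do
        # Reading a single byte is enough to learn whether the file is online.
        if dd if="$_file" of=/dev/null bs=1 count=1 2>/dev/null; then
            return 0
        fi
        _i=$((_i + 1))
        sleep "$_delay"
    done
    echo "error: '$_file' is still not readable after $_tries attempts" >&2
    return 1
}
```

A call such as stage_and_wait /hsm/arch/ka1209/k204221/output_42/file_01.txt followed by the actual read keeps the staging request to a single call per file, in line with the request above not to stage files in a loop.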

File transfer via S3#

If you need to transfer data between the Lustre filesystem and the tape archive, please use ATLAS. However, if data should be transferred between an S3 bucket and the tape archive, the Versity S3 Gateway is suited best.

The archival of data can be done with an S3 client of your choice. Retrieval of files requires an S3 tool which supports the Amazon Glacier command set. Details will be published later on.

File transfer from/to outside DKRZ#

Please contact us via support@dkrz.de