Known Issues (read this!)

file version: 19 April 2022

current slk version: 3.3.21

slk issues on Levante

We are experiencing two issues on Levante that affect the usage of slk. You cannot do anything about them except run slk again a few seconds later.
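
Because both issues are transient, a retry loop in a script can stand in for re-running slk by hand. The following is a minimal sketch and not an official tool: the helper name retry_cmd, its argument order and the delay are made up for this example; only the slk call in the usage comment is taken from this page.

```shell
# retry_cmd MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits with 0
# or until MAX_TRIES attempts have been made.
retry_cmd() {
    local max_tries=$1
    local delay=$2
    local try=1
    local rc=1
    shift 2
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        else
            rc=$?
        fi
        # the time stamp helps to find the matching entry in ~/.slk/slk-cli.log
        echo "$(date '+%Y-%m-%d %H:%M:%S') attempt ${try}/${max_tries} failed (exit code ${rc})" >&2
        try=$((try + 1))
        if [ "$try" -le "$max_tries" ]; then
            sleep "$delay"
        fi
    done
    # give up and hand the last exit code to the caller
    return "$rc"
}

# Usage (example path): up to 3 attempts, 10 seconds apart
# retry_cmd 3 10 slk list /ex/am/ple
```

A retry only helps with the transient errors described in this section; any other failure will simply repeat on every attempt.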

slk/slk_helpers terminate directly after start with “Name or service not known” or “Unhandled error occurred”

The error that slk_helpers prints to the command line is

archive.dkrz.de: Name or service not known

The error that slk prints to the command line is

ERROR: Unhandled error occurred, please check logs

However, if you take a look into the slk log (~/.slk/slk-cli.log), you will find the same error message that slk_helpers displays:

2022-04-07 21:45:30 ERROR archive.dkrz.de: Name or service not known

The error is the same. slk seems to have issues getting the proper routing to the StrongLink constellation.

slk terminates while running with “A fatal error has … Java …” and “SIGBUS”

This SIGBUS-related error does not only occur when slk is used but also when other programs are called on Levante. ATOS is investigating it.

slk archive/retrieve may use much memory and CPU time – careful with parallel slk calls on one node

slk archive and slk retrieve are fast but memory hungry. Additionally, these commands run many threads in parallel when large files or many files are transferred. Running many such slk calls in parallel on one node as one user (a) uses up a lot of memory and (b) causes issues in the thread management. As a rule of thumb, 6 to 8 GB of memory should be assumed for each slk call. However, the exact memory requirement depends on the amount of data that is archived or retrieved by slk.

Be aware that on most Levante nodes, 2 GB of memory is allocated to each physical CPU core. This limit is enforced by the operating system, and processes exceeding the allowed memory usage will be killed. Thus, one should request at least three physical CPU cores for major archival or retrieval tasks on Levante.
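
The rule of thumb above turns into a core count by dividing the expected memory by the memory per core and rounding up. A small sketch of that arithmetic follows; the function name cores_for_memory is made up for this example, and the default of 2 GB per core is the value quoted above for most Levante nodes.

```shell
# cores_for_memory MEM_GB [GB_PER_CORE]
# Prints the number of physical cores whose combined memory share covers MEM_GB.
cores_for_memory() {
    local mem_gb=$1
    local gb_per_core=${2:-2}
    # integer ceiling division: (a + b - 1) / b
    echo $(( (mem_gb + gb_per_core - 1) / gb_per_core ))
}

cores_for_memory 6    # lower end of the rule of thumb: prints 3
cores_for_memory 8    # upper end of the rule of thumb: prints 4
```

If several slk calls are to run on the same node, the memory of all calls has to be added up before converting it into cores.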

slk is hanging / unresponsive

A slk call runs for some time (even a few hours) and nothing seems to happen. There are several possible reasons for this.

lustre file system is hanging

Please check whether /home is hanging. If /home is hanging, slk cannot access its login token and cannot write into its log. Therefore, slk hangs when /home is hanging.

slk retrieve does not hang but the tape recall takes very long

When many retrieve/recall requests of files from tape are processed, the individual calls of slk retrieve might take longer than normal.

one or more source files have 0 byte size

Please check whether you are archiving a file of 0 Byte size. slk archive and slk retrieve hang when such a file is archived or retrieved, respectively.

slk retrieve hangs due to read-write access on one tape

When data is written from the HSM cache to a tape and a slk retrieve call targets a file on the same tape, the slk retrieve call hangs until it is killed. StrongLink does not write onto HPSS tapes. Thus, this issue can only arise if the retrieved files (a) have been archived after 1 November 2021 or (b) have been retrieved at least once since 1 November 2021. The latter is the case because files retrieved via slk from HPSS tapes are automatically written to new StrongLink tapes.

Running commands without -R

Non-recursiveness is interpreted differently in StrongLink than defined in POSIX. If a namespace/directory (not a file) is given as input to the commands slk archive, slk retrieve and slk tag, all files in this namespace/directory are affected. In contrast, cp and rm would throw an error that -r is missing. When -R is set, all sub-namespaces are also affected.

slk writes no output in non-interactive mode

All slk commands except for slk list do not print output to the stdout and stderr streams (== command line output) when they are in non-interactive mode, i.e. running in SLURM jobs. Please catch the exit code of each slk archive call and check whether it is equal to 0. If not, an error occurred. Details on the error can be found in the slk log file ~/.slk/slk-cli.log. However, when you run many slk commands in parallel, the slk log becomes hard to read. Please print the time stamp (e.g. via date) when the error occurred to be able to find the details in the slk log later on. The code below shows how to do this.

The exit code of the previous program call is stored in $?. Example:

$ slk archive /work/project/user/data /ex/am/ple/blub
...
$ echo $?
0
# or 1 or higher

In a bash/batch script it could look like this:

# ...
slk archive /work/project/user/data /ex/am/ple/blub
exit_code=$?

# print exit code with prefix so that it is easy to `grep`
echo "exit code: ${exit_code}"
if [ ${exit_code} -ne 0 ]; then
    #  print date
    date
fi
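
Once the time of a failed call is known, the matching entries can be pulled out of the slk log. The sketch below assumes that log lines start with a time stamp of the form YYYY-MM-DD HH:MM:SS followed by the level, as in the log excerpts on this page; the function name slk_log_errors is made up for this example.

```shell
# slk_log_errors LOGFILE TIME_PREFIX
# Prints the ERROR lines of LOGFILE whose time stamp starts with TIME_PREFIX,
# e.g. "2022-04-07 21:45" for all errors logged within that minute.
slk_log_errors() {
    local logfile=$1
    local prefix=$2
    # index(...) == 1 means: the line begins with the given prefix
    awk -v p="$prefix" 'index($0, p) == 1 && index($0, " ERROR ") > 0' "$logfile"
}

# Usage (log path as given on this page):
# slk_log_errors ~/.slk/slk-cli.log "2022-04-07 21:45"
```

Printing the time in the job with date '+%Y-%m-%d %H:%M' instead of plain date yields a string that can be passed directly as TIME_PREFIX.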

slk never writes to stderr

Error output of slk is written to the stdout stream instead of the stderr stream. If slk output in non-interactive mode was activated (it is not!) then you would find all error output in the SLURM stdout (not stderr) file when running jobs on mistral.

difference: slk move and slk rename

The Linux mv can both move and rename files. slk move can only move files/namespaces from one namespace to another namespace. Renaming can only be performed by slk rename. Both commands can only target one file/namespace at a time. Wildcards are not supported.

slk archive compares file size and timestamp prior to overwriting files

slk archive compares file size and timestamp to decide whether to overwrite a file or not. rsync does it the same way. There might be rare situations when an archived file should be overwritten by another file with the same name, size and timestamp: this would fail.

Availability of archived data and modified metadata might be delayed by a few seconds

StrongLink is a distributed system. Metadata is stored in a distributed metadata database. Some operations might take a few seconds until their results are visible because they have to be synchronized amongst different nodes.

Please wait a few seconds before you retrieve a file that was just archived.

A file listed by slk list is not necessarily available for retrieval yet

The location, name and size of a file are metadata. These metadata are written into the StrongLink metadata database when an archival process starts. slk list only prints metadata. Hence, if slk list lists a file, which is e.g. part of a file set currently being uploaded in a batch job, this file is not necessarily fully uploaded yet. Similarly, aborted slk archive calls can produce a file’s metadata entry without correct data. Such a file can be retrieved without error. Please see “failed or canceled slk archive and slk retrieve calls leave file fragments” for details on file fragments.

failed or canceled slk archive and slk retrieve calls leave file fragments

issues during archival

A file fragment remains in StrongLink if slk archive did not terminate properly during an archival process. Metadata is available for this file fragment and it can be retrieved. It has no checksum. The latter is because some metadata – like checksums – will be written after the archival process has finished successfully. The existence of checksums can be checked via slk_helpers checksum GNS_PATH. In the case of netCDF files, the header section might be copied properly. Thus, an ncdump -h might be successfully applied on a file fragment.

These fragments might occur when a user aborts slk archive (CTRL + C), a ssh connection breaks or a SLURM job is killed due to a timeout. More than one file might be affected because multiple files can be archived in parallel.

issues during retrieval

If slk retrieve does not terminate properly during a retrieval process, a file fragment might be created. These file fragments have temporary file names containing the original FILENAME: ~FILENAME14620203101828317173.slkretrieve. The reasons for improper termination of slk retrieve are the same as for slk archive. More than one file might be affected because multiple files can be retrieved in parallel.

Commonly, a file was correctly retrieved when it has its original filename and when the exit code of slk retrieve is 0 (echo $? directly after retrieval). To be 100% sure that the file was correctly retrieved, you can compare the checksum of the retrieved file with the checksum stored in StrongLink. If there is no checksum stored in StrongLink, the source file is already incomplete.
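
Leftover fragments of an interrupted retrieval can be located by their temporary file names. This is a minimal sketch that assumes all fragments follow the ~FILENAME...slkretrieve pattern shown above; the function name find_retrieve_fragments is made up for this example.

```shell
# find_retrieve_fragments DIR
# Lists leftover temporary files of the form ~FILENAME<digits>.slkretrieve below DIR.
find_retrieve_fragments() {
    find "$1" -type f -name '~*.slkretrieve'
}

# Usage (example path):
# find_retrieve_fragments /work/project/user/data
```

An empty result only shows that no temporary files are left; it does not replace the exit code check or the checksum comparison.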

Pagination mode of slk list

When slk list is used in interactive mode without piping its output into another command, it prints its output in “pagination mode”. This means that only 25 results are printed “per page” and the user has to “turn the page” manually by pressing Return/Enter. Turning a page back is not possible. Even if there are fewer than 25 results, pagination mode is entered and the user has to press Return/Enter to leave it. When the user leaves the pagination mode in the regular way, the terminal is cleared as CTRL + L does. This behavior is by design and cannot be changed. If one wants to avoid the terminal being cleared or does not want to browse through 30 pages, one should abort slk list with CTRL + C. We recommend using slk list in combination with cat, less, more or similar tools in order to avoid the pagination mode. Below you will find an example.

Please note that the output of slk list NAMESPACE and slk list NAMESPACE | cat differs in the last line. This might be important when you create scripts around slk list.

slk list in pagination mode:

$ slk list /k204221_test
drwxrwxrwx- k204221     bm0146                 24 Jun 2021  20210624_test
drwxrwxrwx- k204221     bm0146                 25 Jun 2021  20210625_test
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  abc
drwxrwxrwx- k204221     bm0146                 24 Jun 2021  blubber
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  defg
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  memory_issue_testing
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  sbds_test_data
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  sbds_test_data_b
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  test
drwxrwxrwx- k204221     ka1209                 22 Jun 2021  test_20210617
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  test_20210622
drwxrwxrwx- k204221     ka1209                 22 Jun 2021  testing
Files 1-12 of 12

Avoid pagination mode of slk list:

$ slk list /k204221_test | cat
drwxrwxrwx- k204221     bm0146                 24 Jun 2021  20210624_test
drwxrwxrwx- k204221     bm0146                 25 Jun 2021  20210625_test
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  abc
drwxrwxrwx- k204221     bm0146                 24 Jun 2021  blubber
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  defg
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  memory_issue_testing
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  sbds_test_data
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  sbds_test_data_b
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  test
drwxrwxrwx- k204221     ka1209                 22 Jun 2021  test_20210617
drwxrwxrwx- k204221     bm0146                 22 Jun 2021  test_20210622
drwxrwxrwx- k204221     ka1209                 22 Jun 2021  testing
Files: 12
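
Scripts that process the piped output usually want the entry lines only. The following sketch drops the summary line; it assumes that the last line always has the form "Files: N" as in the example above, and the function name list_entries is made up for this example.

```shell
# list_entries
# Reads the output of `slk list NAMESPACE | cat` on stdin and prints only
# the entry lines, i.e. everything except the trailing "Files: N" summary.
list_entries() {
    sed '/^Files: [0-9][0-9]*$/d'
}

# Usage: count the entries of a namespace
# slk list /k204221_test | list_entries | wc -l
```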

slk tag cannot be applied on individual files

slk tag cannot be applied on individual files but only on namespaces. If it is applied on a namespace, all files in this namespace are assigned the metadata provided in the slk tag call. The namespace itself does not get any metadata assigned. If -R is set, also all files in sub-namespaces are assigned the metadata.

slk does not have a --version flag

Instead, it has a version command: slk version

Update interval of progress bars (slk archive, group, owner, retrieve, tag)

Progress bars are updated per file or per block of n files. If you archive a folder with three files of 99 GB, 550 MB and 450 MB size, you will not see any update of the progress bar for 99% of the archival time while the large 99 GB file is archived; the progress bar will then jump from 0% to 99%. If you tag a few files, the progress bar will remain at 0% for a long time and suddenly jump to 100%.

Using slk list to print search results

slk list prints only the file names, regardless of whether it prints the content of a namespace or the result of a search. However, a search might find files in arbitrary namespaces. Thus, it would be helpful to print the path/namespace of each file when search results are listed. This is not the case. Currently, you cannot find out in which namespace(s) your search results are located.

slk performance on different node types

We suggest running slk archive and slk retrieve on the mistralpp and compute/compute2 nodes. The run time on the mistralpp nodes considerably depends on the activity of other users on these nodes.

Please do not run slk archive and retrieve on the mistral login nodes (mlogin10X) when you archive large amounts of data because slk causes high CPU load and uses much memory.

The available memory per job on the shared nodes is very low. Therefore, slk archive and slk retrieve are slower than on other nodes. The run time can be expected to be two to four times as long as on the mistralpp and compute/compute2 nodes.

group memberships of user updated on login

If a user is added to a new group/project, this information is not automatically passed to StrongLink. Instead, the user has to run slk login again. Background: StrongLink caches LDAP data of each user and only updates its cache on a new login.

slk retrieve does not overwrite files but creates duplicates

When a file already exists, it retrieves a copy and inserts .DUPLICATE_FILENAME.[ID].[VERSION] between name and extension of the file. However, slk retrieve will overwrite these DUPLICATE files without warning. Consecutive retrievals will overwrite this file even if it is modified.

VERSION indicates the file version in StrongLink. If you modify a file and archive it a second time, the version will be incremented by one. Commonly, the version is not visible to you. Old file versions are not kept. Metadata of old versions is partly kept.

Do not archive such a DUPLICATE file because it might overwrite itself during retrieval.
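
Before archiving a directory into which data has been retrieved, it can be checked for such duplicates. This is a minimal sketch that assumes the duplicates keep the .DUPLICATE_FILENAME. infix described above; the function name find_duplicate_files is made up for this example.

```shell
# find_duplicate_files DIR
# Lists files below DIR whose name contains the .DUPLICATE_FILENAME. infix
# that slk retrieve inserts when the target file already exists.
find_duplicate_files() {
    find "$1" -type f -name '*.DUPLICATE_FILENAME.*'
}

# Usage (example path):
# find_duplicate_files /work/project/user/data
```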

“slk retrieve /source/ /target” and “slk retrieve /source /target” are not the same

slk retrieve works the same as rsync with respect to a / appended to the source path.

With / appended to the source path:

$ ls /ex/am/ple/bm0146/k20422/dm/retrieve_us
test.txt

$ slk retrieve -R /ex/am/ple/bm0146/k20422/dm/retrieve_us/ .
...

$ ls .
test.txt

Without / in the end of the source path:

$ ls /ex/am/ple/bm0146/k20422/dm/retrieve_us
test.txt

$ slk retrieve -R /ex/am/ple/bm0146/k20422/dm/retrieve_us .
...

$ ls .
retrieve_us

$ ls ./retrieve_us
test.txt

slk group does not print visible error messages when it fails

Short version

The progress bar of slk group does not properly print the full number of files to modify. It always prints Files changed: n-1/n or Files changed: n/n with n increasing over time. When the slk group call stops working due to an internal error, the user cannot tell whether the currently printed number of modified files n is the number of all available files. Hence, it is important either to capture the exit code of slk group or to have a look into the slk log (~/.slk/slk-cli.log) afterwards.

Long Version

When slk group is recursively applied to a folder with many files in it, the command already starts modifying the first files while StrongLink is still collecting files. The progress bar will show 99% to 100% during the whole time while the file count rises:

$ slk group -R 200524 /ex/am/ple/bm0146/k20422/dm/group_example
[========================================|] 100% complete. Files changed: 10/10, [150M/150M].
[========================================|] 100% complete. Files changed: 11/11, [152M/152M].
[========================================|] 100% complete. Files changed: 19/19, [214M/214M].
...

If some files cannot be modified, this is indicated as follows:

$ slk group -R 200524 /ex/am/ple/bm0146/k20422/dm/group_example
[========================================|] 100% complete. Files changed: 15426/15583, [7.9T/8.0T]. Files failed: 157.

But, when slk group finishes we do not know if all possible files were modified or if slk group was stopped in between (see next example):

$ slk group -R 200524 /k204221_test/testing/stability_20211012_size_500mb_40
[========================================|] 100% complete. Files changed: 15426/15583, [7.9T/8.0T]. Files failed: 157.
$ slk group -R 200524 /k204221_test/testing/stability_20211012_size_500mb_40
[=======================================/] 100% complete. Files changed: 31204/31227, [16.0T/16.1T]. Files failed: 157.

Both slk group calls were applied to the same folder. Therefore, the number of modified files should be the same, but it is not. The reason for this discrepancy is that the first slk group command stopped with exit code 1 after 15583 files. Hence, it is important either to capture the exit code of slk group or to have a look into the slk log (~/.slk/slk-cli.log) afterwards.

slk archive might create namespaces with “.” and “..” as names but slk retrieve interprets them

. and .. will be considered as normal names of namespaces in StrongLink. slk move and slk rename prevent the usage of . and .. (and moving into these). However, slk archive does not prevent this yet. The examples below should clarify this.

When namespaces with names . and .. are retrieved, these names are interpreted by the shell.

# create source data
$ mkdir none dot
$ echo "none" > none/a.txt
$ echo "." > dot/a.txt

# archival
$ slk archive none/a.txt /ex/am/ple/
[========================================\] 100% complete. Files archived: 1/1, [5B/5B].
$ slk archive dot/a.txt /ex/am/ple/.
[========================================-] 100% complete. Files archived: 1/1, [2B/2B].

# see what was archived
$ slk list /ex/am/ple | cat
drwxrwx---- stronglink  group0                 10 Nov 2021  .
-rw-r--r--- stronglink  group0             5   10 Nov 2021  a.txt
Files: 2
$ slk list /ex/am/ple/. | cat
-rw-r--r--- stronglink  group0             2   10 Nov 2021  a.txt

# retrieve top folder recursively
$ slk retrieve -R /ex/am/ple retr_overwrite_20211109_a
[========================================|] 100% complete. Files retrieved: 2/2, [7B/7B].

# check what is there
$ ls -la retr_overwrite_20211109_a/overwrite_20211109_a/
total 9
drwxr-xr-x 2 k204221 bm0146 4096 Nov 10 00:12 .
drwxr-xr-x 3 k204221 bm0146 4096 Nov 10 00:12 ..
-rw------- 1 k204221 bm0146    2 Nov 10 00:12 a.DUPLICATE_FILENAME.52933184010.1.txt
-rw------- 1 k204221 bm0146    5 Nov 10 00:12 a.txt

slk bad_input returns exit code 0

slk BAD_INPUT (like slk acrhvie) prints the help and returns 0 as exit code. The stated rationale is that the help is printed successfully. However, the exit code should be 1 or higher.

slk cannot handle a path with // (double slash)

slk does not substitute // by /. Instead, it creates or looks for a namespace with an empty string as name (// => / + empty string + /). Empty strings as names for namespaces are prohibited. Therefore, commands fail when there is a // in a file path.
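
Paths that are assembled from variables in scripts easily end up with a double slash. A minimal sketch of a clean-up step that can be applied before a path is handed to slk; the function name normalize_path is made up for this example.

```shell
# normalize_path PATH
# Collapses every run of consecutive slashes into a single slash.
normalize_path() {
    printf '%s\n' "$1" | tr -s '/'
}

normalize_path '/ex/am//ple///blub'    # prints /ex/am/ple/blub
```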

Filtering slk list results with “*”

use * to replace parts of the file name

This works fine:

$ slk list /ex/am/ple/\*.nc
...

$ slk list '/ex/am/ple/*.nc'
...

The user needs to prevent the shell (bash/ksh/…) from interpreting the *. Either of the two approaches above does this.
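
The effect of the protection can be seen without slk by counting the arguments that the shell hands to a command. The sketch below uses a scratch directory with two made-up file names; count_args is a helper defined only for this demonstration.

```shell
# count_args prints how many arguments the shell passed to it
count_args() {
    echo "$#"
}

# scratch directory containing two files that match *.nc
start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.nc b.nc

count_args *.nc      # unprotected: the shell expands the pattern to a.nc b.nc, prints 2
count_args \*.nc     # escaped: one literal argument *.nc, prints 1
count_args '*.nc'    # quoted: one literal argument *.nc, prints 1

# clean up
cd "$start_dir" || exit 1
rm -r "$demo_dir"
```

In the unprotected case slk would never see the *; it would receive the names of local files that happen to match the pattern.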

escape * to print the content of a namespace containing * in its name

Assuming we have a namespace with the name *, which is allowed, we might try this to list its content:

$ slk list '/ex/am/ple/\*'
...

This successfully prevents slk list from interpreting the *. However, when a * is in the path, slk list automatically goes into “filter mode”. This means that the content of the namespace /ex/am/ple will be filtered for content with the name *. Hence, we will just get * printed and not its content.

using * to replace parts of namespace names

Using * to replace parts of the names of namespaces does not work. Example:

$ slk list /ex/am/ple/\*/\*.nc
...

$ slk list '/ex/am/ple/*/*.nc'
...

These two list commands will look for *.nc in /ex/am/ple and not in every sub-namespace of /ex/am/ple.
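
One possible workaround is to list the sub-namespaces first and then filter each of them separately. The sketch below only covers the parsing step; it assumes the listing format shown in the slk list examples on this page and names without whitespace, and the function name subnamespace_names is made up for this example.

```shell
# subnamespace_names
# Reads the output of `slk list NAMESPACE | cat` on stdin and prints the names
# of the sub-namespaces, i.e. of the lines whose permission string starts with "d".
# Assumption: names contain no whitespace, so the name is the last field.
subnamespace_names() {
    awk '/^d/ { print $NF }'
}

# Usage sketch: list *.nc in every direct sub-namespace of /ex/am/ple
# slk list /ex/am/ple | subnamespace_names | while read -r name; do
#     slk list "/ex/am/ple/${name}/*.nc" < /dev/null | cat
# done
```

The loop in the usage comment issues one slk list call per sub-namespace, so it is only practical for a moderate number of sub-namespaces.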

slk chmod -R modifies many more file permissions than it should

slk chmod -R creates a tree of all files and of all namespaces in which these files are located. slk chmod -R seems to iterate over the tree in a wrong way so that each file’s permissions are not modified once but 2^[namespace_depth - 1] times.

example 1

$ echo "abc" > test.txt

$ slk archive test.txt /ex/am/ple/ex1/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z
[========================================/] 100% complete. Files archived: 1/1, [...].

# that's OK
$ slk chmod -R 755 /ex/am/ple/ex1/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z
[========================================\] 100% complete. Files changed: 1/1, [4B/4B].

# that's not OK
$ slk chmod -R 755 /ex/am/ple/ex1
^C                          ^C===========\] 100% complete. Files changed: 4431/4431, [...].

example 2

# archive five files into one parent namespace
$ slk archive *.nc /ex/am/ple/ex2a
[========================================/] 100% complete. Files archived: 5/5, [10.8K/10.8K].
$ slk chmod -R 755 /ex/am/ple/ex2a
[========================================\] 100% complete. Files changed: 6/6, [10.8K/10.8K].
# ================>>>>>>>>>>>> SIX RESOURCES MODIFIED <<<<<<<<<<<<================

# archive five files into three sub-namespaces:
$ slk archive *.nc /ex/am/ple/ex2b/d1/d2/d3
[========================================-] 100% complete. Files archived: 5/5, [10.8K/10.8K].
$ slk chmod -R 755 /ex/am/ple/ex2b
[========================================|] 100% complete. Files changed: 55/55, [86.3K/86.3K].
# ================>>>>>>>>>>>> FIFTY-FIVE RESOURCES MODIFIED <<<<<<<<<<<<================

example 3

echo "abc" > test.txt

# no subfolder; n=0
# (mode 755 assumed, as in examples 1 and 2)
slk archive test.txt /ex/am/ple/test01
slk chmod -R 755 /ex/am/ple/test01
# => 2 resources (1x file, 1x namespace)

# subfolder; n=1
slk archive test.txt /ex/am/ple/test01/test02
slk chmod -R 755 /ex/am/ple/test01
# => 5 resources (2x same file, 3x namespaces: 2x test02, 1x test01)

# subfolder in subfolder; n=2
slk archive test.txt /ex/am/ple/test01/test02/test03
slk chmod -R 755 /ex/am/ple/test01
# => 11 resources (4x same file, 7x namespaces: 4x test03, 2x test02, 1x test01)

# subfolder in subfolder in subfolder; n=3
slk archive test.txt /ex/am/ple/test01/test02/test03/test04
slk chmod -R 755 /ex/am/ple/test01
# => 23 resources (8x same file, 15x namespaces: 8x test04, 4x test03, 2x test02, 1x test01)

# ... n ...
...
# => 2^n * FILES + 2^(n+1) - 1 resources: each file 2^n times; 2^(n+1) - 1 namespaces
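
The closing formula of example 3 can be evaluated for any number of files and any nesting depth. This is a small sketch of that arithmetic; the function name chmod_resource_count is made up for this example, and the formula itself is the empirical one from the examples above.

```shell
# chmod_resource_count FILES N
# Evaluates 2^N * FILES + 2^(N+1) - 1, the number of resources that
# slk chmod -R touches for FILES files located N sub-namespaces deep.
chmod_resource_count() {
    local files=$1
    local n=$2
    local pow=$(( 1 << n ))    # 2^N
    echo $(( pow * files + 2 * pow - 1 ))
}

chmod_resource_count 1 3    # example 3 with n=3: prints 23
chmod_resource_count 5 3    # example 2, second case: prints 55
```

For example 1, with one file located 26 sub-namespaces below /ex/am/ple/ex1, the same formula gives 201326591 resources, which fits with that call having been canceled by hand.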

How to search non-recursively in a namespace

slk search cannot search non-recursively in a namespace provided via path. As a workaround, please get the object id of the particular namespace via slk_helpers exists and then use it in your search query as the value for the search field resources.parent_id (see slk Usage Examples).

Terminal cursor disappears if slk command with progress bar is canceled

If a slk command with a progress bar is canceled by the user, the shell cursor might disappear. One can make it re-appear by (a) running reset or (b) starting vim and leaving it directly (:q!).

error “conflict with jdk/…” when the slk module is loaded

slk needs a specific Java version that is automatically loaded with slk. Having other Java versions loaded in parallel might cause unwanted side effects. Therefore, the system throws an error message and aborts.

slk needs a specific Java version

You might encounter an error like this:

$ slk list 12
CLI tools require Java 13 (found 1)

slk needs a specific Java version. This Java version is automatically loaded when the slk module is loaded. If you have another Java version loaded explicitly, please unload it prior to loading the slk module. If you have loaded slk already, please: (1) unload slk, (2) unload all Java modules and (3) load slk again.

slk search yields RQL parse error

ERROR: Search failed. Reason: RQL parse error: No period found in collection field name ().

Either: Please consider using ' around your search query instead of " to prevent operators starting with $ from being evaluated as bash variables.

Or: Please escape $’s belonging to query operators when you use " as delimiters of the query string.
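
The difference between the two quoting styles can be made visible by printing the query string instead of passing it to slk search. In the sketch below the query is a placeholder: the field name resources.size and the operator $gt are assumptions chosen for illustration and are not taken from this page.

```shell
# single quotes: the shell leaves $gt untouched
single_quoted='{"resources.size": {"$gt": 1024}}'

# double quotes: the shell replaces $gt by the (normally unset) variable gt;
# the subshell tolerates unset variables even if the caller uses `set -u`
double_quoted=$(set +u; unset gt; printf '%s\n' "{\"resources.size\": {\"$gt\": 1024}}")

# double quotes with escaped dollar sign: the operator stays intact
escaped="{\"resources.size\": {\"\$gt\": 1024}}"

printf '%s\n' "$single_quoted"    # {"resources.size": {"$gt": 1024}}
printf '%s\n' "$double_quoted"    # {"resources.size": {"": 1024}}
printf '%s\n' "$escaped"          # {"resources.size": {"$gt": 1024}}
```

The second line of output contains an empty name where the operator should be, which matches the empty parentheses at the end of the error message.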

slk login asks me to provide a hostname and/or a domain

If you are asked for this information, the configuration is faulty. Please contact support@dkrz.de and tell us on which machine you are working.

Archival fails and Java NullPointerException in the log

This error message is printed in the log:

2021-07-13 08:33:03 ERROR Unexpected exception
java.lang.NullPointerException: null
    at com.stronglink.slkcli.api.websocket.NodeThreadPools.getBestPool(NodeThreadPools.kt:28) ~[slk-cli-tools-3.1.62.jar:?]
    at com.stronglink.slkcli.archive.Archive.upload(Archive.kt:191) ~[slk-cli-tools-3.1.62.jar:?]
    at com.stronglink.slkcli.archive.Archive.uploadResource(Archive.kt:165) ~[slk-cli-tools-3.1.62.jar:?]
    at com.stronglink.slkcli.archive.Archive.archive(Archive.kt:77) [slk-cli-tools-3.1.62.jar:?]
    at com.stronglink.slkcli.SlkCliMain.run(SlkCliMain.kt:169) [slk-cli-tools-3.1.62.jar:?]
    at com.stronglink.slkcli.SlkCliMainKt.main(SlkCliMain.kt:103) [slk-cli-tools-3.1.62.jar:?]
2021-07-13 08:33:03 INFO

This error indicates that there is an API issue. A reason might be that one or more StrongLink nodes went offline and the other nodes did not take over their connections yet. Please notify support@dkrz.de if you experience this error.

slk login ERROR: Unhandled error occurred, please check logs

The error message printed in the log starts with:

2022-03-25 14:39:50 ERROR No transformation found: class io.ktor.utils.io.ByteBufferChannel -> ...
status: 200 OK
response headers:
...

When the error occurs

You run slk login, misspell your password on the first try and provide the correct password on the second or later try.

Solving the error

Run slk login a second time.

permissions of retrieved files are “rw-------” although umask is set differently

slk retrieve ignores umask and partly ignores ACLs (via setfacl). Instead, it always sets rw-------.

While slk retrieve is copying a file, nobody should interact with this file. Therefore, no read/write/execute permissions are granted to other users than the owner. After the retrieval is finished, the permissions should be updated according to umask and ACLs. However, this is not done.
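
Until this is fixed, the umask-based permissions can be applied by hand after the retrieval has finished. The following is a minimal sketch for regular files only; it does not restore ACLs, and the function name apply_umask_mode is made up for this example.

```shell
# apply_umask_mode FILE...
# Gives each FILE the mode that a newly created regular file would get
# under the current umask, i.e. 666 with the umask bits removed.
apply_umask_mode() {
    local mask mode
    mask=$(umask)                                   # e.g. 0022
    mode=$(printf '%03o' $(( 0666 & ~0$mask )))     # e.g. 644
    chmod "$mode" "$@"
}

# Usage (example file name): after slk retrieve has returned with exit code 0
# apply_umask_mode test.txt
```

The function should only be called once slk retrieve has finished, for the reason given above: nobody should interact with a file while it is still being copied.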

slk archive: Exception …: lateinit property websocket has not been initialized

Full error message on the command line:

Exception in thread "Thread-357" kotlin.UninitializedPropertyAccessException: lateinit property websocket has not been initialized
at com.stronglink.slkcli.queue.ArchiveWebsocketWorker.closeConnection(ArchiveWebsocketWorker.kt:146)
at com.stronglink.slkcli.queue.WebsocketWorker.run(WebsocketWorker.kt:67)

Error message in the log:

2022-03-01 13:50:28 ERROR Error in websocket worker
java.util.concurrent.CompletionException: java.net.http.WebSocketHandshakeException
        at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367) ~[?:?]

Reason

Probably, slk archive was run with --streams 10 or a similarly high number like --streams 16 or --streams 32.

Solution

Please use slk archive --streams N with a maximum value of 4 for N. Transfer rates of 1 to 2 GB/s are possible with this configuration when the system is not busy.

slk archive runs infinitely when files of 0 byte size are archived

If we archive files with slk archive and at least one file has a size of 0 Byte then slk archive will archive all files but will not quit. Instead, it will run infinitely. When run in a batch job, it is not possible to determine whether slk archive is still archiving or whether it is hanging due to a 0 Byte file.
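
Since the hang cannot be detected from within a batch job, it is easier to look for 0 Byte files before the archival starts. This is a minimal sketch using find; the function name list_empty_files is made up for this example.

```shell
# list_empty_files DIR
# Lists all regular files of 0 Byte size below DIR.
list_empty_files() {
    find "$1" -type f -size 0
}

# Usage (example path): stop before archiving if any 0 Byte file is present
# if [ -n "$(list_empty_files /work/project/user/data)" ]; then
#     echo "0 Byte files found, not archiving" >&2
#     exit 1
# fi
```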

slk delete failed, but nevertheless file was deleted

Issue description

We run slk delete /abc/def/ghi.txt but slk delete fails due to an unknown reason. Repeated calls of slk delete /abc/def/ghi.txt fail because the target file does not exist anymore.

Reason

The reason has not been fully identified yet. This one is the most probable reason: When slk delete sends a deletion request to StrongLink, it waits a certain time for the response of the StrongLink instance to return. If this does not happen or if the reply of another confirmation step does not return in time (= timeout), slk assumes that the command failed.

Solution

Please carefully check whether the files were actually deleted when a slk delete call did not finish successfully.