Speed Up Retrievals with Striping¶
file version: 20 April 2022
current slk version: 3.3.21
Overview¶
Note
On mistral, striping is mandatory for retrieval of files > 10 GB to sustain high lustre I/O performance for all users. We recommend using striping on levante for the time being. This recommendation will change at some point in the future.
On a lustre file system, the content of files is served from so-called object storage targets (OSTs). Lustre allows you to configure, per file or per folder, across how many OSTs a file’s content is split. This configuration option is called striping. striping=1 means that the file’s content is served by one OST only. striping=4 (for example) means that the file’s content is split into four distinct parts which are served by four OSTs, one part per OST. On the lustre file system connected to mistral, all files are stored with striping set to 1 by default. The default configuration on the lustre file system of levante has yet to be decided. The general advantages and disadvantages of striping are not described here; numerous online sources cover this topic.
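As a rough illustration of what a striping factor means in practice: with the raid0 pattern that appears in the lfs getstripe output further down this page, Lustre is generally described as distributing a file in chunks of one stripe size, round-robin over its stripe objects. The sketch below computes which stripe object (counted from 0) would hold a given byte offset under that assumption. The function name stripe_index is hypothetical and is not part of lfs or slk.

```shell
#!/bin/sh
# Hypothetical helper (not part of lfs or slk).
# Assumes round-robin (raid0) placement of fixed-size chunks.
# Usage: stripe_index OFFSET_BYTES STRIPE_SIZE_BYTES STRIPE_COUNT
stripe_index() {
    offset=$1
    stripe_size=$2
    stripe_count=$3
    # Number of the chunk containing the offset, wrapped over the stripe objects
    echo $(( (offset / stripe_size) % stripe_count ))
}
```

For instance, with a stripe size of 4 MiB (4194304 bytes) and a striping factor of 8, stripe_index 5000000 4194304 8 prints 1: byte 5000000 lies in the second chunk and therefore on the second stripe object.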
On mistral: Striping is mandatory. The lustre file system will probably slow down or start hanging when several large files (> 10 GB) are retrieved via slk with striping set to 1. This will affect many other users working on mistral at the same time. Therefore, we strongly urge you to use striping for files > 10 GB. We suggest setting striping to 8. It might even be reasonable to use striping for small files when many files are retrieved at once. The striping factor has to be set prior to file retrieval. It has to be set on folder level and not on file level. Setting it on file level does not work because files get a temporary, unpredictable name during the retrieval and are renamed after successful retrieval.
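The order of operations described above (create the folder, set the striping on the folder, then retrieve) could be wrapped in a small helper. This is only a sketch: the function name retrieve_striped and the DRY_RUN variable are hypothetical, and the lfs setstripe and slk retrieve calls are the ones used elsewhere on this page with a constant striping factor.

```shell
#!/bin/sh
# Hypothetical helper (not part of slk): stripe the target folder first,
# then retrieve into it. Set DRY_RUN=1 to only print the commands.
# Usage: retrieve_striped ARCHIVE_PATH TARGET_DIR [STRIPE_COUNT]
retrieve_striped() {
    src=$1
    dest=$2
    count=${3:-8}   # default striping factor of 8, as suggested on this page
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    # Striping is set on the folder, before any file is retrieved into it
    $run mkdir -p "$dest" || return 1
    $run lfs setstripe -S 4M -c "$count" "$dest" || return 1
    $run slk retrieve "$src" "$dest"
}
```

Running it once with DRY_RUN=1 shows the three commands without executing them, which may help to double-check the target folder before a large retrieval.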
On levante: Striping is recommended. We do not expect the lustre file system of levante to slow down the way the lustre file system of mistral does. Nevertheless, we recommend using striping for the time being. When you set up striping, we highly recommend not using a constant striping factor but setting up a progressive file layout (= automatic striping based on file size).
Note
The lustre file system of mistral does not support progressive file layout. Therefore, a constant striping factor has to be used on mistral.
Set and Check Striping¶
Striping has to be set prior to file creation via one of these two commands:
# ON MISTRAL (constant striping factor)
lfs setstripe -S 4M -c 8 TARGET
# all files have striping factor of 8
# ON LEVANTE (progressive file layout)
lfs setstripe -E 1G -c 1 -S 1M -E 4G -c 4 -S 1M -E -1 -c 8 -S 1M TARGET
# file size < 1 GB => striping factor of 1
# file size < 4 GB => striping factor of 4
# file size >= 4 GB => striping factor of 8
TARGET can be either the soon-to-be created/retrieved file or a directory in/into which files are created/retrieved. The command cannot be applied to files that already exist. If the command is applied to a directory that contains files, the old files’ striping remains unchanged but new files are striped as defined.
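To make the size thresholds in the comments of the progressive file layout command more concrete, the following sketch maps a file size in bytes to the striping factor named in those comments. It is only an illustration: the function name stripe_count_for_size is hypothetical, the thresholds are simply copied from the levante example above, and 1G is assumed to mean 1 GiB.

```shell
#!/bin/sh
# Hypothetical helper mirroring the thresholds of the progressive
# file layout example (assumption: 1G = 1 GiB = 1073741824 bytes).
# Usage: stripe_count_for_size SIZE_BYTES
stripe_count_for_size() {
    size=$1
    gib=$((1024 * 1024 * 1024))
    if [ "$size" -lt "$gib" ]; then
        echo 1          # below 1 GiB
    elif [ "$size" -lt $((4 * gib)) ]; then
        echo 4          # 1 GiB up to below 4 GiB
    else
        echo 8          # 4 GiB and larger
    fi
}
```

On a system without progressive file layout, such a mapping could serve as a starting point for choosing the value passed to -c per target folder; it is not an official recommendation.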
The striping of a file can be checked via:
lfs getstripe -d TARGET
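In scripts it can be handy to extract just the stripe count from this output. The sketch below reads lfs getstripe output on standard input and prints the value of the first lmm_stripe_count line. The function name stripe_count_from_output is hypothetical, and the field name is taken from the example output on this page; other Lustre versions or files with a progressive file layout may print different field names.

```shell
#!/bin/sh
# Hypothetical helper: print the stripe count found in `lfs getstripe`
# output read from stdin (field name as in the examples on this page).
stripe_count_from_output() {
    awk '$1 == "lmm_stripe_count:" { print $2; exit }'
}
```

It would be used as: lfs getstripe -d TARGET | stripe_count_from_output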
Example¶
We have an archived file /arch/ab01234/c567890/test01.nc and create four local directories retr01, retr02, retr03 and retr04. The example file only has a size of 105 MB. In real applications, the file should be at least approximately 50 GB in size.
No Striping¶
We will retrieve the archived file into retr01 without setting any striping and check the striping afterwards.
$ mkdir retr01
$ slk retrieve /arch/ab01234/c567890/test01.nc retr01
[========================================|] 100% complete. Files retrieved: 1/1, [105.5M/105.5M].
$ lfs getstripe -d retr01/test01.nc
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 34
obdidx objid objid group
34 25319089 0x18256b1 0
Set Striping for a file¶
We will retrieve the archived file into retr02 and set striping to 8 prior to retrieval. Afterwards, we check the striping.
$ mkdir retr02
$ lfs setstripe -S 4M -c 8 retr02/test01.nc
$ slk retrieve /arch/ab01234/c567890/test01.nc retr02
[========================================|] 100% complete. Files retrieved: 1/1, [105.5M/105.5M].
$ lfs getstripe -d retr02/test01.nc
lmm_stripe_count: 8
lmm_stripe_size: 4194304
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 95
obdidx objid objid group
95 25298851 0x18207a3 0
62 25301879 0x1821377 0
82 27774549 0x1a7ce55 0
2 28747311 0x1b6a62f 0
79 25442734 0x18439ae 0
15 28451203 0x1b22183 0
118 20014121 0x1316429 0
119 19860330 0x12f0b6a 0
Set Striping with a constant striping factor for a directory¶
First, we create a file test02.txt in retr03. Second, we set striping of the directory retr03 to 8. Finally, we retrieve the file and check the striping.
$ mkdir retr03
$ echo "abc" > retr03/test02.txt
$ lfs setstripe -S 4M -c 8 retr03
$ slk retrieve /arch/ab01234/c567890/test01.nc retr03
[========================================|] 100% complete. Files retrieved: 1/1, [105.5M/105.5M].
$ lfs getstripe -d retr03/test01.nc
lmm_stripe_count: 8
lmm_stripe_size: 4194304
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 80
obdidx objid objid group
80 27491641 0x1a37d39 0
31 25444333 0x1843fed 0
64 25053678 0x17e49ee 0
61 25355939 0x182e6a3 0
6 28884651 0x1b8beab 0
111 24873595 0x17b8a7b 0
79 25442733 0x18439ad 0
73 25332380 0x1828a9c 0
$ lfs getstripe -d retr03/test02.txt
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 26
obdidx objid objid group
26 25574671 0x1863d0f 0
Set Striping with progressive file layout for a directory¶
First, we create a file test03.txt in retr04. Second, we set striping of the directory retr04 to a progressive file layout. Finally, we retrieve the file and check the striping.
$ mkdir retr04
$ echo "abc" > retr04/test03.txt
$ lfs setstripe -E 1G -c 1 -S 1M -E 4G -c 4 -S 1M -E -1 -c 8 -S 1M retr04
$ slk retrieve /arch/ab01234/c567890/test01.nc retr04
[========================================|] 100% complete. Files retrieved: 1/1, [105.5M/105.5M].
$ lfs getstripe -d retr04/test01.nc
lcm_layout_gen: 0
lcm_mirror_count: 1
lcm_entry_count: 3
lcme_id: N/A
lcme_mirror_id: N/A
lcme_flags: 0
lcme_extent.e_start: 0
lcme_extent.e_end: 1073741824
stripe_count: 1 stripe_size: 1048576 pattern: raid0 stripe_offset: -1
lcme_id: N/A
lcme_mirror_id: N/A
lcme_flags: 0
lcme_extent.e_start: 1073741824
lcme_extent.e_end: 4294967296
stripe_count: 4 stripe_size: 1048576 pattern: raid0 stripe_offset: -1
lcme_id: N/A
lcme_mirror_id: N/A
lcme_flags: 0
lcme_extent.e_start: 4294967296
lcme_extent.e_end: EOF
stripe_count: 8 stripe_size: 1048576 pattern: raid0 stripe_offset: -1
$ lfs getstripe -d retr04/test03.txt
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 26
obdidx objid objid group
26 25574671 0x1863d0f 0
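The extent boundaries in the progressive file layout output above are printed in bytes, which makes them hard to compare with the -E 1G and -E 4G arguments at a glance. The following sketch converts such a value to GiB; the function name extent_to_gib is hypothetical, and it assumes that the G suffix of lfs setstripe denotes GiB, which matches the values 1073741824 and 4294967296 shown above.

```shell
#!/bin/sh
# Hypothetical helper: convert an extent boundary from `lfs getstripe`
# output (bytes, or the literal EOF) into whole GiB for easier reading.
# Usage: extent_to_gib VALUE
extent_to_gib() {
    case $1 in
        EOF) echo "EOF" ;;                            # open-ended last component
        *)   echo "$(( $1 / 1073741824 )) GiB" ;;     # integer division by 1 GiB
    esac
}
```

Applied to the three e_end values above, this gives 1 GiB, 4 GiB and EOF, i.e. the three components created by -E 1G, -E 4G and -E -1.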